Abstract

In this work we offer an $O(|V|^2 |E| W)$ pseudo-polynomial time deterministic algorithm for
solving the Value Problem and Optimal Strategy Synthesis in Mean Payoff Games. This improves
by a factor $\log(|V| W)$ the best previously known pseudo-polynomial time upper bound
due to Brim, et al. The improvement hinges on a suitable characterization of values, and a
description of optimal positional strategies, in terms of reweighted Energy Games and Small
Energy-Progress Measures.

1 Introduction

A Mean Payoff Game (MPG) is a two-player infinite game $\Gamma := (V, E, w, (V_0, V_1))$, which is
played on a finite weighted directed graph, denoted $G^\Gamma := (V, E, w)$, the vertices of which are
partitioned into two classes, $V_0$ and $V_1$, according to the player to which they belong. It is
assumed that $G^\Gamma$ has no sink vertex and that the weights of the arcs are integers, i.e.,
$w : E \to \{-W, \ldots, 0, \ldots, W\}$ for some $W \in \mathbb{N}$.

At the beginning of the game a pebble is placed on some vertex $v_s \in V$, and then the two players,
named Player 0 and Player 1, move the pebble ad infinitum along the arcs. Assuming the pebble
is currently on Player 0's vertex $v$, then he chooses an arc $(v, v') \in E$ going out of $v$ and moves
the pebble to the destination vertex $v'$. Similarly, assuming the pebble is currently on Player 1's
vertex, then it is her turn to choose an outgoing arc. The infinite sequence $v_s, v, v', \ldots$ of all the
encountered vertices is a play. In order to play well, Player 0 wants to maximize the limit inferior of
the long-run average weight of the traversed arcs, i.e., to maximize
$\liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} w(v_i, v_{i+1})$,
whereas Player 1 wants to minimize the
$\limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} w(v_i, v_{i+1})$.
Ehrenfeucht and Mycielski proved that each vertex $v$ admits a value, denoted $\mathrm{val}^\Gamma(v)$,
which each player can secure by means of a memoryless (or positional) strategy, i.e., a strategy that
depends only on the current vertex position and not on the previous choices.

Solving an MPG consists in computing the values of all vertices (Value Problem) and, for
each player, a positional strategy that secures such values to that player (Optimal Strategy
Synthesis). The corresponding decision problem lies in NP $\cap$ coNP [11] and it was later shown by
Jurdziński to be recognizable with unambiguous polynomial time non-deterministic Turing
Machines, thus falling within the UP $\cap$ coUP complexity class.

The problem of devising efficient algorithms for solving MPGs has been studied extensively
in the literature. The first milestone was that of Gurvich, Karzanov and Khachiyan, in which
they offered an exponential time algorithm for solving a slightly wider class of MPGs called Cyclic
Games. Afterwards, Zwick and Paterson [11] devised the first deterministic procedure for computing
values in MPGs, and optimal strategies securing them, within pseudo-polynomial time and
polynomial space. In particular, Zwick and Paterson established an $O(|V|^3 |E| W)$ upper bound for
the time complexity of the Value Problem, as well as an upper bound of
$O(|V|^4 |E| W \log(|E|/|V|))$ for that of Optimal Strategy Synthesis [11].

Recently, several research efforts have been spent in studying quantitative extensions of
infinite games for modeling quantitative aspects of reactive systems. In this context quantities
may represent, for example, the power usage of an embedded component, or the buffer size of
a networking element. These studies have brought to light interesting connections with MPGs.
Remarkably, they have recently led to the design of faster procedures for solving them. In
particular, Brim, et al. [3] devised faster deterministic algorithms for solving the Value Problem and
Optimal Strategy Synthesis in MPGs within $O(|V|^2 |E| W \log(|V| W))$ pseudo-polynomial time and
polynomial space.

To the best of our knowledge, this is the tightest pseudo-polynomial upper bound on the time 
complexity of MPGs which is currently known. 

Indeed, a wide spectrum of different approaches has been investigated in the literature. For
instance, Andersson and Vorobyov [1] provided a fast sub-exponential time randomized algorithm for
solving MPGs, whose time complexity can be bounded as
$O(|V|^2 |E| \exp(2\sqrt{|V| \ln(|E|/\sqrt{|V|})} + O(\sqrt{|V|} + \ln |E|)))$.
Furthermore, Lifshits and Pavlov [9] devised an $O(2^{|V|} |V| |E| \log W)$ singly-exponential
time deterministic procedure by considering the potential theory of MPGs.

These results are summarized in Table 1.

Table 1: Complexity of the main Algorithms for solving MPGs.

Algorithm   Value Problem                  Optimal Strategy Synthesis       Note
This work   $O(|V|^2 |E| W)$               $O(|V|^2 |E| W)$                 Determ.
[3]         $O(|V|^2 |E| W \log(|V| W))$   $O(|V|^2 |E| W \log(|V| W))$     Determ.
[11]        $O(|V|^3 |E| W)$               $O(|V|^4 |E| W \log(|E|/|V|))$   Determ.
[9]         $O(2^{|V|} |V| |E| \log W)$    n/a                              Determ.
[1]         $O(|V|^2 |E| e^{2\sqrt{|V| \ln(|E|/\sqrt{|V|})} + O(\sqrt{|V|} + \ln |E|)})$   same complexity   Random.


Contribution. The main contribution of this work is to provide an $O(|V|^2 |E| W)$
pseudo-polynomial time and $O(|V|)$ space deterministic algorithm for solving the Value Problem and
Optimal Strategy Synthesis in MPGs. As already mentioned in the introduction, the best previously
known procedure has a deterministic time complexity of $O(|V|^2 |E| W \log(|V| W))$, which is due to
Brim, et al. [3]. In this way we improve the best previously known pseudo-polynomial time upper
bound by a factor $\log(|V| W)$. This result is summarized in the following theorem.

Theorem 1. There exists a deterministic algorithm for solving the Value Problem and Optimal
Strategy Synthesis of MPGs within $O(|V|^2 |E| W)$ time and $O(|V|)$ space, on any input MPG
$\Gamma = (V, E, w, (V_0, V_1))$. Here, $W = \max_{e \in E} |w_e|$.

In order to prove Theorem 1 this work points out a novel and suitable characterization of
values, and a description of optimal positional strategies, in terms of certain reweighting operations
that we will introduce later on.

In particular, we will show that the optimal value $\mathrm{val}^\Gamma(v)$ of any vertex $v$ is the
unique rational number $\nu$ for which $v$ "transits" from the winning region of Player 0 to that of
Player 1, with respect to reweightings of the form $w - \nu$. This intuition will be clarified later on
in Section 3, where Theorem 3 is formally proved.

Concerning strategies, we will show that an optimal positional strategy for each vertex $v \in V_0$
is given by any arc $(v, v') \in E$ which is compatible with certain Small Energy-Progress Measures
(SEPMs) of the above mentioned reweighted arenas. This fact is formally proved in Theorem 4 of
Section 3.

These novel observations are smooth, simple, and their proofs rely on elementary arguments.
We believe that they contribute to clarifying the interesting relationship between values, optimal
strategies and reweighting operations (with respect to some previous literature).
Indeed, they will allow us to prove Theorem 1.

Organization. This manuscript is organized as follows. In Section 2, we introduce some notation
and provide the required background on infinite two-player games and related algorithmic results.
In Section 3, a suitable relation between values, optimal strategies, and certain reweighting
operations is investigated. In Section 4, an $O(|V|^2 |E| W)$ pseudo-polynomial time and $O(|V|)$ space
algorithm, for solving the Value Problem and Optimal Strategy Synthesis in MPGs, is designed
and analyzed by relying on the results presented in Section 3. In this manner, Section 4 actually
provides a proof of Theorem 1, which is our main result in this work.

2 Notation and Preliminaries 

We denote by $\mathbb{N}$, $\mathbb{Z}$, $\mathbb{Q}$ the set of natural, integer, and rational
numbers (respectively). It will be sufficient to consider integral intervals, e.g.,
$[a, b] := \{z \in \mathbb{Z} \mid a \le z \le b\}$ and $[a, b) := \{z \in \mathbb{Z} \mid a \le z < b\}$
for any $a, b \in \mathbb{Z}$.

Weighted Graphs. Our graphs are directed and weighted on the arcs. Thus, if $G = (V, E, w)$ is a
graph, then every arc $e \in E$ is a triplet $e = (u, v, w_e)$, where $w_e = w(u, v) \in \mathbb{Z}$ is the
weight of $e$. The maximum absolute weight is $W := \max_{e \in E} |w_e|$. Given a vertex $u \in V$, the
set of its successors is $\mathrm{post}(u) = \{v \in V \mid (u, v) \in E\}$, whereas the set of its
predecessors is $\mathrm{pre}(u) = \{v \in V \mid (v, u) \in E\}$.
A path is a sequence of vertices $v_0 v_1 \ldots v_n \ldots$ such that $(v_i, v_{i+1}) \in E$ for every $i$.
We denote by $V^*$ the set of all (possibly empty) finite paths. A simple path is a finite path
$v_0 v_1 \ldots v_n$ having no repetitions, i.e., for any $i, j \in [0, n]$ it holds $v_i \ne v_j$ whenever
$i \ne j$. The length of a simple path $p = v_0 v_1 \ldots v_n$ equals $n$ and it is denoted by $|p|$. A cycle
is a path $v_0 v_1 \ldots v_{n-1} v_n$ such that $v_0 \ldots v_{n-1}$ is simple and $v_n = v_0$. The length of
a cycle $C = v_0 v_1 \ldots v_n$ equals $n$ and it is denoted by $|C|$. The average weight of a cycle
$v_0 \ldots v_n$ is $\frac{1}{n} \sum_{i=0}^{n-1} w(v_i, v_{i+1})$. A cycle $C = v_0 v_1 \ldots v_n$ is
reachable from $v$ in $G$ if there exists a simple path $p = v u_1 \ldots u_m$ in $G$ such that
$p \cap C \ne \emptyset$.
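As a small illustration of the definition of average weight, the following Python sketch computes it for a cycle given as a vertex list $v_0 v_1 \ldots v_n$ with $v_n = v_0$. The function name `cycle_average_weight`, the dictionary encoding of $w$, and the three-vertex example are assumptions made for illustration only; they do not come from the original text.

```python
from fractions import Fraction

def cycle_average_weight(cycle, w):
    """Average weight of a cycle given as v_0 v_1 ... v_n with v_n == v_0."""
    if len(cycle) < 2 or cycle[0] != cycle[-1]:
        raise ValueError("a cycle must start and end at the same vertex")
    n = len(cycle) - 1  # number of arcs, i.e. the length |C|
    total = sum(w[(cycle[i], cycle[i + 1])] for i in range(n))
    return Fraction(total, n)  # exact rational, no rounding

# Hypothetical 3-cycle a -> b -> c -> a with weights 2, -1, 0.
w = {("a", "b"): 2, ("b", "c"): -1, ("c", "a"): 0}
avg = cycle_average_weight(["a", "b", "c", "a"], w)
```

Exact rationals are used because the values discussed later are rational numbers with bounded denominators, and floating-point rounding would blur comparisons between them.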

Arenas. An arena is a tuple $\Gamma = (V, E, w, (V_0, V_1))$ where $G^\Gamma := (V, E, w)$ is a finite
weighted directed graph and $(V_0, V_1)$ is a partition of $V$ into the set $V_0$ of vertices owned by
Player 0, and the set $V_1$ of vertices owned by Player 1. It is assumed that $G^\Gamma$ has no sink, i.e.,
$\mathrm{post}(v) \ne \emptyset$ for every $v \in V$; still, we remark that $G^\Gamma$ is not required to be
a bipartite graph on colour classes $V_0$ and $V_1$. Fig. 1 depicts an example.


Figure 1: An arena $\Gamma$.

A game on $\Gamma$ is played for infinitely many rounds by two players moving a pebble along the
arcs of $G^\Gamma$. At the beginning of the game we find the pebble on some vertex $v_s \in V$, which is
called the starting position of the game. At each turn, assuming the pebble is currently on a vertex
$v \in V_i$ (for $i = 0, 1$), Player $i$ chooses an arc $(v, v') \in E$ and then the next turn starts with
the pebble on $v'$.

A play is any infinite path $v_0 v_1 \ldots v_n \ldots \in V^\omega$ in $\Gamma$. For any $i \in \{0, 1\}$, a
strategy of Player $i$ is any function $\sigma_i : V^* \times V_i \to V$ such that for every finite path
$p'v$ in $G^\Gamma$, where $p' \in V^*$ and $v \in V_i$, it holds that $(v, \sigma_i(p', v)) \in E$. A strategy
$\sigma_i$ of Player $i$ is positional (or memoryless) if $\sigma_i(p, v_n) = \sigma_i(p', v'_m)$ for every
finite paths $p v_n = v_0 \ldots v_{n-1} v_n$ and $p' v'_m = v'_0 \ldots v'_{m-1} v'_m$ in $G^\Gamma$ such
that $v_n = v'_m \in V_i$. The set of all the positional strategies of Player $i$ is denoted by
$\Sigma^M_i$. A play $v_0 v_1 \ldots v_n \ldots$ is consistent with a strategy $\sigma \in \Sigma_i$ if
$v_{j+1} = \sigma(v_0 v_1 \ldots v_j)$ whenever $v_j \in V_i$.

Given a starting position $v_s \in V$, the outcome of strategies $\sigma_0 \in \Sigma_0$ and
$\sigma_1 \in \Sigma_1$, denoted $\mathrm{outcome}^\Gamma(v_s, \sigma_0, \sigma_1)$, is the unique play that
starts at $v_s$ and is consistent with both $\sigma_0$ and $\sigma_1$.

Given a memoryless strategy $\sigma_i \in \Sigma^M_i$ of Player $i$ in $\Gamma$, then
$G^\Gamma_{\sigma_i} = (V, E_{\sigma_i}, w)$ is the graph obtained from $G^\Gamma$ by removing all the arcs
$(v, v') \in E$ such that $v \in V_i$ and $v' \ne \sigma_i(v)$; we say that $G^\Gamma_{\sigma_i}$ is obtained
from $G^\Gamma$ by projection w.r.t. $\sigma_i$.

Concluding this subsection, the notion of reweighting is recalled. For any weight functions
$w, w' : E \to \mathbb{Z}$, the reweighting of $\Gamma = (V, E, w, (V_0, V_1))$ w.r.t. $w'$ is the arena
$\Gamma^{w'} = (V, E, w', (V_0, V_1))$.

Also, for $w : E \to \mathbb{Z}$ and any $\nu \in \mathbb{Z}$, we denote by $w + \nu$ the weight function
$w'$ defined as $w'_e := w_e + \nu$ for every $e \in E$. Indeed, we shall consider reweighted games of the
form $\Gamma^{w - q}$, for some $q \in \mathbb{Q}$. Notice that the corresponding weight function
$w' : E \to \mathbb{Q} : e \mapsto w_e - q$ is rational, while we required the weights of the arcs to be
always integers. To overcome this issue, it is sufficient to re-define $\Gamma^{w - q}$ by scaling all the
weights by a factor equal to the denominator of $q \in \mathbb{Q}$, namely, to re-define:
$\Gamma^{w - q} := \Gamma^{D \cdot w - N}$, where $N, D \in \mathbb{N}$ are such that $q = N/D$ and
$\gcd(N, D) = 1$.

This re-scaling will not change the winning regions of the corresponding games, and it has the 
significant advantage of allowing for a discussion (and an algorithmics) which is strictly based on 
integer weights. 
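The integer re-scaling just described can be sketched in a few lines of Python. The helper name `reweight` and the dictionary representation of the weight function are illustrative assumptions, not part of the original text; `fractions.Fraction` is used only because it normalizes $q$ to lowest terms with a positive denominator.

```python
from fractions import Fraction

def reweight(w, q):
    """Integer weights D*w_e - N standing for the reweighting w - q, with q = N/D."""
    q = Fraction(q)                      # lowest terms, denominator > 0
    n, d = q.numerator, q.denominator
    return {e: d * we - n for e, we in w.items()}

# Hypothetical two-arc weight function and threshold q = 1/2.
w = {("a", "b"): 3, ("b", "a"): -2}
w_scaled = reweight(w, Fraction(1, 2))   # weights 2*3 - 1 and 2*(-2) - 1
```

Each scaled weight has the same sign as the corresponding rational weight $w_e - q$, since it is obtained from it by multiplying by the positive integer $D$.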
Mean Payoff Games. A Mean Payoff Game (MPG) is a game played on some arena $\Gamma$ for
infinitely many rounds by two opponents. Player 0 gains a payoff defined as the long-run
average weight of the play, whereas Player 1 loses that value. Formally, the Player 0's payoff of a
play $v_0 v_1 \ldots v_n \ldots$ in $\Gamma$ is defined as follows:

$$\mathrm{MP}_0(v_0 v_1 \ldots v_n \ldots) := \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} w(v_i, v_{i+1}).$$

The value secured by a strategy $\sigma_0 \in \Sigma_0$ in a vertex $v$ is defined as:

$$\mathrm{val}^{\sigma_0}(v) := \inf_{\sigma_1 \in \Sigma_1} \mathrm{MP}_0(\mathrm{outcome}^\Gamma(v, \sigma_0, \sigma_1)).$$

Notice that payoffs and secured values can be defined symmetrically for Player 1 (i.e., by
interchanging the symbol 0 with 1 and inf with sup).

Ehrenfeucht and Mycielski proved that each vertex $v \in V$ admits a unique value, denoted
$\mathrm{val}^\Gamma(v)$, which each player can secure by means of a memoryless (or positional) strategy.
Moreover, uniform positional optimal strategies do exist for both players, in the sense that for each
player there exists at least one positional strategy which can be used to secure all the optimal values,
independently with respect to the starting position $v_s$. Thus, for every MPG $\Gamma$, there exists a
strategy $\sigma_0 \in \Sigma^M_0$ such that $\mathrm{val}^{\sigma_0}(v) \ge \mathrm{val}^\Gamma(v)$ for every
$v \in V$, and there exists a strategy $\sigma_1 \in \Sigma^M_1$ such that
$\mathrm{val}^{\sigma_1}(v) \le \mathrm{val}^\Gamma(v)$ for every $v \in V$. Indeed, the (optimal) value of a
vertex $v \in V$ in the MPG $\Gamma$ is given by:

$$\mathrm{val}^\Gamma(v) = \sup_{\sigma_0 \in \Sigma_0} \mathrm{val}^{\sigma_0}(v) = \inf_{\sigma_1 \in \Sigma_1} \mathrm{val}^{\sigma_1}(v).$$

Thus, a strategy $\sigma_0 \in \Sigma_0$ is optimal if $\mathrm{val}^{\sigma_0}(v) = \mathrm{val}^\Gamma(v)$ for
all $v \in V$. A strategy $\sigma_0 \in \Sigma_0$ is said to be winning for Player 0 if
$\mathrm{val}^{\sigma_0}(v) \ge 0$, and $\sigma_1 \in \Sigma_1$ is winning for Player 1 if
$\mathrm{val}^{\sigma_1}(v) < 0$. Correspondingly, a vertex $v \in V$ is a winning starting position for
Player 0 if $\mathrm{val}^\Gamma(v) \ge 0$, otherwise it is winning for Player 1. The set of all winning
starting positions of Player $i$ is denoted by $W_i$ for $i \in \{0, 1\}$.



Figure 2: An MPG on $\Gamma$, played from left to right, whose payoff equals 0.

A finite variant of MPGs is well-known in the literature. Here, the game stops as
soon as a cyclic sequence of vertices is traversed (i.e., as soon as one of the two players moves
the pebble into a previously visited vertex). It turns out that this finite variant is equivalent to the
infinite one. Specifically, the values of an MPG are in relationship with the average weights of
its cycles, as stated in the next lemma.

Lemma 1 (Brim, et al. [3]). Let $\Gamma = (V, E, w, (V_0, V_1))$ be an MPG. For all $\nu \in \mathbb{Q}$,
for all positional strategies $\sigma_0 \in \Sigma^M_0$ of Player 0, and for all vertices $v \in V$, the value
$\mathrm{val}^{\sigma_0}(v)$ is greater than $\nu$ if and only if all cycles $C$ reachable from $v$ in the
projection graph $G^\Gamma_{\sigma_0}$ have an average weight $w(C)/|C|$ greater than $\nu$.

The proof of Lemma 1 follows from the memoryless determinacy of MPGs. We remark that
a proposition which is symmetric to Lemma 1 holds for Player 1 as well: for all $\nu \in \mathbb{Q}$, for all
positional strategies $\sigma_1 \in \Sigma^M_1$ of Player 1, and for all vertices $v \in V$, the value
$\mathrm{val}^{\sigma_1}(v)$ is less than $\nu$ if and only if all cycles reachable from $v$ in the projection
graph $G^\Gamma_{\sigma_1}$ have an average weight less than $\nu$.

Also, it is well-known that each value $\mathrm{val}^\Gamma(v)$ is contained within the following set of
rational numbers:

$$S_\Gamma = \{ N/D \mid D \in [1, |V|],\ N \in [-DW, DW] \}.$$

Notice that $|S_\Gamma| = O(|V|^2 W)$.


The present work tackles the algorithmics of the following two classical problems:

• Value Problem. Compute for each vertex $v \in V$ the (rational) optimal value $\mathrm{val}^\Gamma(v)$.

• Optimal Strategy Synthesis. Compute an optimal positional strategy $\sigma_0 \in \Sigma^M_0$.

Currently, the asymptotically fastest pseudo-polynomial time algorithm for solving both
problems is a deterministic procedure whose time complexity is $O(|V|^2 |E| W \log(|V| W))$ [3]. This
result has been achieved by devising a binary-search procedure that ultimately reduces the Value
Problem and Optimal Strategy Synthesis to the resolution of yet another family of games known as
the Energy Games. Even though we do not rely on binary-search in the present work, and thus we
will introduce some truly novel ideas that diverge from the previous solutions, still, we will reduce
to solving multiple instances of Energy Games. For this reason, the Energy Games are recalled in
the next paragraph.


Energy Games and Small Energy-Progress Measures. An Energy Game (EG) is a game that
is played on an arena $\Gamma$ for infinitely many rounds by two opponents, where the goal of Player 0 is
to construct an infinite play $v_0 v_1 \ldots v_n \ldots$ such that for some initial credit $c \in \mathbb{N}$
the following holds:

$$c + \sum_{i=0}^{j} w(v_i, v_{i+1}) \ge 0, \quad \text{for all } j \ge 0. \qquad (1)$$

Given a credit $c \in \mathbb{N}$, a play $v_0 v_1 \ldots v_n \ldots$ is winning for Player 0 if it satisfies
(1), otherwise it is winning for Player 1. A vertex $v \in V$ is a winning starting position for Player 0 if
there exists an initial credit $c \in \mathbb{N}$ and a strategy $\sigma_0 \in \Sigma_0$ such that, for every
strategy $\sigma_1 \in \Sigma_1$, the play $\mathrm{outcome}^\Gamma(v, \sigma_0, \sigma_1)$ is winning for
Player 0. As in the case of MPGs, the EGs are memoryless determined, i.e., for every $v \in V$, either $v$
is winning for Player 0 or $v$ is winning for Player 1, and (uniform) memoryless strategies are
sufficient to win the game. In fact, as shown in the next lemma, the decision problems of MPGs and EGs
are intimately related.

Lemma 2 (Brim, et al. [3]). Let $\Gamma = (V, E, w, (V_0, V_1))$ be an arena. For all threshold
$\nu \in \mathbb{Q}$, for all vertices $v \in V$, Player 0 has a strategy in the MPG $\Gamma$ that secures
value at least $\nu$ from $v$ if and only if, for some initial credit $c \in \mathbb{N}$, Player 0 has a
winning strategy from $v$ in the reweighted EG $\Gamma^{w - \nu}$.

In this work we are especially interested in the Minimum Credit Problem (MCP) for EGs: for
each winning starting position $v$, compute the minimum initial credit $c^* = c^*(v)$ such that there
exists a winning strategy $\sigma_0 \in \Sigma^M_0$ for Player 0 starting from $v$. A fast
pseudo-polynomial time deterministic procedure for solving MCPs comes from [3].

Theorem 2 (Brim, et al. [3]). There exists a deterministic algorithm for solving the MCP within
$O(|V| |E| W)$ pseudo-polynomial time, on any input EG $(V, E, w, (V_0, V_1))$.

The algorithm mentioned in Theorem 2 is the Value-Iteration algorithm analyzed by Brim, et
al. in [3]. Its rationale relies on the notion of Small Energy-Progress Measures (SEPMs). These are
bounded, non-negative and integer-valued functions that impose local conditions to ensure global
properties on the arena, in particular, witnessing that Player 0 has a way to enforce conservativity
(i.e., non-negativity of cycles) in the resulting game's graph. Recovering standard notation, see
e.g. [3], let us denote $\mathcal{C}_\Gamma = \{n \in \mathbb{N} \mid n \le |V| W\} \cup \{\top\}$ and let
$\preceq$ be the total order on $\mathcal{C}_\Gamma$ defined as $x \preceq y$ if and only if either
$y = \top$ or $x, y \in \mathbb{N}$ and $x \le y$.

In order to cast the minus operation to range over $\mathcal{C}_\Gamma$, let us consider an operator
$\ominus : \mathcal{C}_\Gamma \times \mathbb{Z} \to \mathcal{C}_\Gamma$ defined as follows:

$$a \ominus b := \begin{cases} \max(0, a - b), & \text{if } a \ne \top \text{ and } a - b \le |V| W; \\ \top, & \text{otherwise.} \end{cases}$$

Given an EG $\Gamma$ on vertex set $V = V_0 \cup V_1$, a function $f : V \to \mathcal{C}_\Gamma$ is a Small
Energy-Progress Measure (SEPM) for $\Gamma$ if and only if the following two conditions are met:

1. if $v \in V_0$, then $f(v) \succeq f(v') \ominus w(v, v')$ for some $(v, v') \in E$;

2. if $v \in V_1$, then $f(v) \succeq f(v') \ominus w(v, v')$ for all $(v, v') \in E$.
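The two SEPM conditions are purely local and can be checked arc by arc. The following Python sketch does so under some illustrative assumptions that are not in the original text: the top element $\top$ is encoded as `math.inf`, the arena is given by a successor map `post` and a weight dictionary `w`, and the names `ominus` and `is_sepm` are hypothetical. The sketch does not check that the values of `f` lie in the energy-level set; it only tests the two inequalities.

```python
import math

TOP = math.inf  # assumed encoding of the top element of the energy-level set

def ominus(a, b, bound):
    """a (-) b truncated to {0, ..., bound} or TOP; bound stands for |V| * W."""
    if a != TOP and a - b <= bound:
        return max(0, a - b)
    return TOP

def is_sepm(f, post, w, V0, bound):
    """Check the two local SEPM conditions at every vertex."""
    for v, successors in post.items():
        needed = [ominus(f[u], w[(v, u)], bound) for u in successors]
        if v in V0:
            ok = any(f[v] >= x for x in needed)   # some outgoing arc
        else:
            ok = all(f[v] >= x for x in needed)   # every outgoing arc
        if not ok:
            return False
    return True

# Hypothetical arena: a (Player 0) -> b with weight -1, b (Player 1) -> a with weight +1.
post = {"a": ["b"], "b": ["a"]}
w = {("a", "b"): -1, ("b", "a"): 1}
bound = 2 * 1  # |V| * W
good = is_sepm({"a": 1, "b": 0}, post, w, {"a"}, bound)
bad = is_sepm({"a": 0, "b": 0}, post, w, {"a"}, bound)
```

With this encoding, Python's ordinary comparison on numbers agrees with the order $\preceq$: every natural number is below `math.inf`, and `math.inf` is only below itself.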

The values of a SEPM, i.e., the elements of the image $f(V)$, are called the energy levels of $f$.
It is worth denoting by $V_f = \{v \in V \mid f(v) \ne \top\}$ the set of vertices having finite energy.
Given a SEPM $f$ and a vertex $v \in V_0$, an arc $(v, v') \in E$ is said to be compatible with $f$
whenever $f(v) \succeq f(v') \ominus w(v, v')$; moreover, a positional strategy
$\sigma^f_0 \in \Sigma^M_0$ is said to be compatible with $f$ whenever for all $v \in V_0$, if
$\sigma^f_0(v) = v'$ then $(v, v') \in E$ is compatible with $f$. Notice that, as mentioned in [3], if
$f$ and $g$ are SEPMs, then so is the minimum function defined as: $h(v) = \min\{f(v), g(v)\}$ for every
$v \in V$. This fact allows one to consider the least SEPM, namely, the unique SEPM
$f^* : V \to \mathcal{C}_\Gamma$ such that, for any other SEPM $g : V \to \mathcal{C}_\Gamma$, the following
holds: $f^*(v) \preceq g(v)$ for every $v \in V$. Also concerning SEPMs, we shall rely on the following
lemmata. The first one relates SEPMs to the winning region $W_0$ of Player 0 in EGs.

Lemma 3 (Brim, et al. [3]). Let $\Gamma = (V, E, w, (V_0, V_1))$ be an EG.

1. If $f$ is any SEPM of the EG $\Gamma$ and $v \in V_f$, then $v$ is a winning starting position for
Player 0 in the EG $\Gamma$. Stated otherwise, $V_f \subseteq W_0$;

2. If $f^*$ is the least SEPM of the EG $\Gamma$, and $v$ is a winning starting position for Player 0 in
the EG $\Gamma$, then $v \in V_{f^*}$. Thus, $V_{f^*} = W_0$.

Also notice that the following bound holds on the energy levels of any SEPM (actually by
definition of $\mathcal{C}_\Gamma$).

Lemma 4. Let $\Gamma = (V, E, w, (V_0, V_1))$ be an EG. Let $f$ be any SEPM of $\Gamma$. Then, for every
$v \in V$ either $f(v) = \top$ or $0 \le f(v) \le |V| W$.

Value-Iteration Algorithm. The algorithm devised by Brim, et al. for solving the MCP in EGs
is known as Value-Iteration [3]. Given an EG $\Gamma$ as input, the Value-Iteration aims to compute the
least SEPM $f^*$ of $\Gamma$. This simple procedure basically relies on a lifting operator $\delta$. Given
$v \in V$, the lifting operator
$\delta(\cdot, v) : [V \to \mathcal{C}_\Gamma] \to [V \to \mathcal{C}_\Gamma]$ is defined by
$\delta(f, v) = g$, where:

$$g(u) = \begin{cases} f(u) & \text{if } u \ne v \\ \min\{f(v') \ominus w(v, v') \mid v' \in \mathrm{post}(v)\} & \text{if } u = v \in V_0 \\ \max\{f(v') \ominus w(v, v') \mid v' \in \mathrm{post}(v)\} & \text{if } u = v \in V_1 \end{cases}$$

We also need the following definition. Given a function $f : V \to \mathcal{C}_\Gamma$, we say that $f$ is
inconsistent in $v$ whenever one of the following two holds:

1. $v \in V_0$ and for all $v' \in \mathrm{post}(v)$ it holds $f(v) \prec f(v') \ominus w(v, v')$;

2. $v \in V_1$ and there exists $v' \in \mathrm{post}(v)$ such that $f(v) \prec f(v') \ominus w(v, v')$.

To start with, the Value-Iteration algorithm initializes $f$ to the constant zero function, i.e.,
$f(v) = 0$ for every $v \in V$. Furthermore, the procedure maintains a list $L$ of vertices in order to
witness the inconsistencies of $f$. Initially, $v \in V_0 \cap L$ if and only if all arcs going out of $v$
are negative, while $v \in V_1 \cap L$ if and only if $v$ is the source of at least one negative arc.
Notice that checking the above conditions takes time $O(|E|)$.

As long as the list $L$ is nonempty, the algorithm picks a vertex $v$ from $L$ and performs the
following:

1. Apply the lifting operator $\delta(f, v)$ to $f$ in order to resolve the inconsistency of $f$ in $v$;

2. Insert into $L$ all vertices $u \in \mathrm{pre}(v) \setminus L$ witnessing a new inconsistency due to
the increase of $f(v)$.

(The same vertex can't occur twice in $L$, i.e., there are no duplicate vertices in $L$.)

The algorithm terminates when $L$ is empty. This concludes the description of the Value-Iteration
algorithm.

As shown in [3], the update of $L$ following an application of the lifting operator $\delta(f, v)$
requires $O(|\mathrm{pre}(v)|)$ time. Moreover, a single application of the lifting operator
$\delta(\cdot, v)$ takes $O(|\mathrm{post}(v)|)$ time at most. This implies that the algorithm can be
implemented so that it will always halt within $O(|V| |E| W)$ time (the reader is referred to [3] in
order to grasp all the details of the proof of correctness and complexity).
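The description above can be turned into a short Python sketch. It is meant only to illustrate the lifting loop, under assumptions that are not in the original text: $\top$ is encoded as `math.inf`, arcs are given as a dictionary from pairs to integer weights, and the function and variable names are hypothetical. For simplicity the sketch re-evaluates the lifting operator whenever it tests a predecessor for inconsistency, so it does not achieve the $O(|V| |E| W)$ bound of the implementation analyzed in [3].

```python
import math
from collections import deque

TOP = math.inf  # assumed encoding of the top element

def ominus(a, b, bound):
    """a (-) b truncated to {0, ..., bound} or TOP."""
    if a != TOP and a - b <= bound:
        return max(0, a - b)
    return TOP

def value_iteration(V0, V1, arcs):
    """Least SEPM of an energy game; arcs maps (u, v) -> integer weight."""
    V0, V1 = set(V0), set(V1)
    V = V0 | V1
    post = {v: [] for v in V}
    pre = {v: [] for v in V}
    for (u, v) in arcs:
        post[u].append(v)
        pre[v].append(u)
    bound = len(V) * max(abs(x) for x in arcs.values())  # |V| * W

    def lifted(f, v):
        # Value assigned to v by the lifting operator: min for Player 0, max for Player 1.
        vals = [ominus(f[u], arcs[(v, u)], bound) for u in post[v]]
        return min(vals) if v in V0 else max(vals)

    f = {v: 0 for v in V}
    # f is inconsistent in v exactly when f(v) is strictly below the lifted value.
    queue = deque(v for v in V if f[v] < lifted(f, v))
    queued = set(queue)  # keeps the list free of duplicates
    while queue:
        v = queue.popleft()
        queued.discard(v)
        f[v] = lifted(f, v)
        for u in pre[v]:
            if u not in queued and f[u] < lifted(f, u):
                queue.append(u)
                queued.add(u)
    return f

# Hypothetical arena: a (Player 0) -> b weighs -1, b (Player 1) -> a weighs +1,
# and c (Player 0) has only a self-loop of weight -1.
arcs = {("a", "b"): -1, ("b", "a"): 1, ("c", "c"): -1}
f_star = value_iteration({"a", "c"}, {"b"}, arcs)
```

In this example the cycle through `a` and `b` has total weight zero, so Player 0 wins from both vertices with a small credit, whereas the negative self-loop forces the energy level of `c` up to the top element.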

Remark. The Value-Iteration procedure lends itself to the following basic generalization, which
turns out to be of pivotal importance in order to best suit our technical needs. Let $f^*$ be the least
SEPM of the EG $\Gamma$. Recall that, as a first step, the Value-Iteration algorithm initializes $f$ to be
the constant zero function. Here, we remark that this is not really necessary. Indeed, it is
sufficient to initialize $f$ to be any function $f_0$ which bounds $f^*$ from below, that is to say, to
initialize $f$ to any $f_0 : V \to \mathcal{C}_\Gamma$ such that $f_0(v) \preceq f^*(v)$ for every
$v \in V$. Soon after, $L$ can be initialized in a natural way: just insert $v$ into $L$ if and only if
$f_0$ is inconsistent at $v$. This initialization still requires $O(|E|)$ time and it doesn't affect the
correctness of the procedure.

So, we shall assume to have at our disposal a procedure named Value-Iteration(), which
takes as input an EG $\Gamma = (V, E, w, (V_0, V_1))$ and an initial function $f_0$ that bounds from below
the least SEPM $f^*$ of the EG $\Gamma$ (i.e., s.t. $f_0(v) \preceq f^*(v)$ for every $v \in V$). Then,
Value-Iteration() outputs the least SEPM $f^*$ of the EG $\Gamma$ within $O(|V| |E| W)$ time and working
with $O(|V|)$ space.

3 Values and Optimal Positional Strategies from Reweightings

This section aims to show that values and optimal positional strategies of MPGs admit a suitable
description in terms of reweighted arenas. This fact will be the crux for solving the Value Problem
and Optimal Strategy Synthesis in $O(|V|^2 |E| W)$ time.

3.1 On optimal values

A simple representation of values in terms of Farey sequences is now observed; then, a
characterization of values in terms of reweighted arenas is provided.

Optimal values and Farey sequences. Recall that each value $\mathrm{val}^\Gamma(v)$ is contained within
the following set of rational numbers:

$$S_\Gamma = \{ N/D \mid D \in [1, |V|],\ N \in [-DW, DW] \}.$$

Let us introduce some notation in order to handle $S_\Gamma$ in a way that is suitable for our purposes.
Firstly, we write every $\nu \in S_\Gamma$ as $\nu = i + F$, where $i = i_\nu = \lfloor \nu \rfloor$ is the
integral and $F = F_\nu = \{\nu\} = \nu - i$ is the fractional part. Notice that $i \in [-W, W]$ and that
$F$ is a non-negative rational number having denominator at most $|V|$.
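The decomposition $\nu = i + F$ uses the floor, so the fractional part stays non-negative even for negative values. A minimal Python sketch follows; the helper name `split_value` and the sample value are assumptions made for illustration.

```python
import math
from fractions import Fraction

def split_value(nu):
    """Return (i, F) with nu = i + F, i the floor of nu and 0 <= F < 1."""
    i = math.floor(nu)   # floor of an exact rational is an int
    return i, nu - i

# Hypothetical value -7/3: the integral part is -3, not -2.
i, F = split_value(Fraction(-7, 3))
```

Since subtracting an integer does not change the denominator of a fraction in lowest terms, $F$ keeps the denominator of $\nu$.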

As a consequence, it is worthwhile to consider the Farey sequence $\mathcal{F}_n$ of order $n = |V|$.
This is the increasing sequence of all irreducible fractions from the (rational) interval $[0, 1]$ with
denominators less than or equal to $n$. In the rest of this paper, $\mathcal{F}_n$ denotes the following
sorted set:

$$\mathcal{F}_n = \{ N/D \mid 0 \le N \le D \le n,\ \gcd(N, D) = 1 \}.$$

Farey sequences have numerous and interesting properties, in particular, many algorithms for
generating the entire sequence $\mathcal{F}_n$ in time $O(n^2)$ are known in the literature, and these
rely on Stern-Brocot trees and mediant properties. Notice that the above mentioned quadratic running
time is optimal, as it is well-known that the sequence $\mathcal{F}_n$ has
$s(n) = \frac{3 n^2}{\pi^2} + O(n \ln n) = \Theta(n^2)$ terms.

Throughout the article, we shall assume that $F_0, \ldots, F_{s-1}$ is an increasing ordering of
$\mathcal{F}_n$, so that $\mathcal{F}_n = \{F_j\}_{j=0}^{s-1}$ and $F_j < F_{j+1}$ for every $j$.

Also notice that $F_0 = 0$ and $F_{s-1} = 1$.

For example, $\mathcal{F}_5 = \{0, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{2}{5}, \frac{1}{2}, \frac{3}{5}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, 1\}$.
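One standard way of listing a Farey sequence in increasing order is the next-term recurrence, which derives each term from the two preceding ones in constant time. The Python sketch below uses that recurrence; it is offered as an illustration of the generation step and is not claimed to be the specific method used by the authors. The function name `farey` is an assumption.

```python
from fractions import Fraction

def farey(n):
    """Terms of the Farey sequence of order n, in increasing order."""
    a, b, c, d = 0, 1, 1, n          # first two terms: 0/1 and 1/n
    terms = [Fraction(a, b)]
    while c <= n:
        k = (n + b) // d
        # Next-term recurrence: (a/b, c/d) -> (c/d, (k*c - a)/(k*d - b)).
        a, b, c, d = c, d, k * c - a, k * d - b
        terms.append(Fraction(a, b))
    return terms

F5 = farey(5)
```

Each iteration does a constant amount of arithmetic, so the running time is linear in the number of terms, which is quadratic in $n$.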

At this point, $S_\Gamma$ can be represented as follows:

$$S_\Gamma = [-W, W) + \mathcal{F}_{|V|} = \{ i + F_j \mid i \in [-W, W),\ j \in [0, s-1] \}.$$

The above representation of $S_\Gamma$ will be convenient in a while.


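Optimal values and reweightings. Two introductory lemmata are shown below; then, a
characterization of optimal values in terms of reweightings is provided.

Lemma 5. Let $\Gamma = (V, E, w, (V_0, V_1))$ be an MPG and let $q \in \mathbb{Q}$ be a rational number
having denominator $D \in \mathbb{N}$. Then,
$\mathrm{val}^\Gamma(v) = \frac{1}{D} \mathrm{val}^{\Gamma^{w+q}}(v) - q$ holds for every $v \in V$.

Proof. Let us consider the play
$\mathrm{outcome}^{\Gamma^{w+q}}(v, \sigma_0, \sigma_1) = v_0 v_1 \ldots v_n \ldots$ By the definition of
$\mathrm{val}^\Gamma(v)$, and by that of reweighting $\Gamma^{w+q}$ ($= \Gamma^{D \cdot w + N}$), the
following holds:

$$\begin{aligned}
\mathrm{val}^{\Gamma^{w+q}}(v) &= \sup_{\sigma_0 \in \Sigma_0} \inf_{\sigma_1 \in \Sigma_1} \mathrm{MP}_0(\mathrm{outcome}^{\Gamma^{w+q}}(v, \sigma_0, \sigma_1)) \\
&= \sup_{\sigma_0 \in \Sigma_0} \inf_{\sigma_1 \in \Sigma_1} \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} (D \cdot w(v_i, v_{i+1}) + N) \quad (\text{if } q = N/D) \\
&= D \cdot \sup_{\sigma_0 \in \Sigma_0} \inf_{\sigma_1 \in \Sigma_1} \mathrm{MP}_0(\mathrm{outcome}^\Gamma(v, \sigma_0, \sigma_1)) + N \\
&= D \cdot \mathrm{val}^\Gamma(v) + N.
\end{aligned}$$

Then, $\mathrm{val}^\Gamma(v) = \frac{1}{D} \mathrm{val}^{\Gamma^{w+q}}(v) - \frac{N}{D} = \frac{1}{D} \mathrm{val}^{\Gamma^{w+q}}(v) - q$
holds for every $v \in V$.

□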
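The computation in the proof rests on the fact that averaging commutes with the affine map $x \mapsto D \cdot x + N$. The Python sketch below checks this on the weights of a single cycle; the cycle weights, the value of $q$, and the helper name `cycle_mean` are all hypothetical choices made for illustration, and a single cycle is of course only a special case of the statement.

```python
from fractions import Fraction

def cycle_mean(weights):
    """Exact average of a list of integer arc weights."""
    return Fraction(sum(weights), len(weights))

weights = [2, -1, 0]                 # hypothetical cycle weights under w
q = Fraction(3, 4)                   # hypothetical q = N/D with N = 3, D = 4
N, D = q.numerator, q.denominator
scaled = [D * x + N for x in weights]    # weights under D*w + N
original_mean = cycle_mean(weights)
recovered_mean = cycle_mean(scaled) / D - q
```

Dividing the scaled average by $D$ and subtracting $q$ gives back the original average, mirroring the last line of the proof.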


Lemma 6. Given an MPG $\Gamma = (V, E, w, (V_0, V_1))$, let us consider the reweightings:

$$\Gamma_{i,j} := \Gamma^{w - i - F_j}, \quad \text{for any } i \in [-W, W] \text{ and } j \in [0, s-1],$$

where $s = |\mathcal{F}_{|V|}|$ and $F_j$ is the $j$-th term of the Farey sequence $\mathcal{F}_{|V|}$.

Then, the following propositions hold:

1. For any $i \in [-W, W]$ and $j \in [0, s-1]$, we have:

$v \in W_0(\Gamma_{i,j})$ if and only if $\mathrm{val}^\Gamma(v) \ge i + F_j$;

2. For any $i \in [-W, W]$ and $j \in [1, s-1]$, we have:

$v \in W_1(\Gamma_{i,j})$ if and only if $\mathrm{val}^\Gamma(v) \le i + F_{j-1}$.

Proof.

1. Let us fix arbitrarily some $i \in [-W, W]$ and $j \in [0, s-1]$.

Assume that $F_j = N_j/D_j$ for some $N_j, D_j \in \mathbb{N}$.

Since

$$\Gamma_{i,j} = (V, E, D_j (w - i) - N_j, (V_0, V_1)),$$

then by Lemma 5 (applied to $q = -i - F_j$) we have:

$$\mathrm{val}^\Gamma(v) = \frac{1}{D_j} \mathrm{val}^{\Gamma_{i,j}}(v) + i + F_j.$$

Recall that $v \in W_0(\Gamma_{i,j})$ if and only if $\mathrm{val}^{\Gamma_{i,j}}(v) \ge 0$.

Hence, we have $v \in W_0(\Gamma_{i,j})$ if and only if the following inequality holds:

$$\mathrm{val}^\Gamma(v) = \frac{1}{D_j} \mathrm{val}^{\Gamma_{i,j}}(v) + i + F_j \ge i + F_j.$$

This proves Item 1.

2. The argument is symmetric to that of Item 1, but with some further observations.

Let us fix arbitrarily some $i \in [-W, W]$ and $j \in [1, s-1]$. Assume that $F_j = N_j/D_j$ for
some $N_j, D_j \in \mathbb{N}$. Since $\Gamma_{i,j} = (V, E, D_j (w - i) - N_j, (V_0, V_1))$, then by Lemma 5
we have $\mathrm{val}^\Gamma(v) = \frac{1}{D_j} \mathrm{val}^{\Gamma_{i,j}}(v) + i + F_j$. Recall that
$v \in W_1(\Gamma_{i,j})$ if and only if $\mathrm{val}^{\Gamma_{i,j}}(v) < 0$.

Hence, we have $v \in W_1(\Gamma_{i,j})$ if and only if the following inequality holds:

$$\mathrm{val}^\Gamma(v) = \frac{1}{D_j} \mathrm{val}^{\Gamma_{i,j}}(v) + i + F_j < i + F_j.$$

Now, recall from Section 3.1 that $\mathrm{val}^\Gamma(v) \in S_\Gamma$, where

$$S_\Gamma = \{ i + F_j \mid i \in [-W, W),\ j \in [0, s-1] \}.$$

By hypothesis we have:

$$j \ge 1 \text{ and } 0 \le F_{j-1} < F_j,$$

thus, at this point, $v \in W_1(\Gamma_{i,j})$ if and only if $\mathrm{val}^\Gamma(v) \le i + F_{j-1}$.
This proves Item 2.

□


We are now in the position to provide a simple characterization of values in terms of
reweightings.

Theorem 3. Given an MPG $\Gamma = (V, E, w, (V_0, V_1))$, consider the reweightings:

$$\Gamma_{i,j} := \Gamma^{w - i - F_j}, \quad \text{for any } i \in [-W, W] \text{ and } j \in [1, s-1],$$

where $s = |\mathcal{F}_{|V|}|$ and $F_j$ is the $j$-th term of the Farey sequence $\mathcal{F}_{|V|}$.
Then, the following holds:

$$\mathrm{val}^\Gamma(v) = i + F_{j-1} \text{ if and only if } v \in W_0(\Gamma_{i,j-1}) \cap W_1(\Gamma_{i,j}).$$

Proof. Let us fix arbitrarily some $i \in [-W, W]$ and $j \in [1, s-1]$.

By Item 1 of Lemma 6 we have $v \in W_0(\Gamma_{i,j-1})$ if and only if
$\mathrm{val}^\Gamma(v) \ge i + F_{j-1}$. Symmetrically, by Item 2 of Lemma 6 we have
$v \in W_1(\Gamma_{i,j})$ if and only if $\mathrm{val}^\Gamma(v) \le i + F_{j-1}$. Whence, by
composition, $v \in W_0(\Gamma_{i,j-1}) \cap W_1(\Gamma_{i,j})$ if and only if
$\mathrm{val}^\Gamma(v) = i + F_{j-1}$. □


3.2 On optimal positional strategies 

The present subsection aims to provide a suitable description of optimal positional strategies in
terms of reweighted arenas. An introductory lemma is shown next.

Lemma 7. Let $\Gamma = (V, E, w, (V_0, V_1))$ be an MPG, the following hold:

1. If $v \in V_0$, let $v' \in \mathrm{post}(v)$. Then $\mathrm{val}^\Gamma(v') \le \mathrm{val}^\Gamma(v)$ holds.

2. If $v \in V_1$, let $v' \in \mathrm{post}(v)$. Then $\mathrm{val}^\Gamma(v') \ge \mathrm{val}^\Gamma(v)$ holds.

3. Given any $v \in V_0$, consider the reweighted EG $\Gamma_v := \Gamma^{w - \mathrm{val}^\Gamma(v)}$.

Let $f_v : V \to \mathcal{C}_{\Gamma_v}$ be any SEPM of the EG $\Gamma_v$ such that $v \in V_{f_v}$ (i.e.,
$f_v(v) \ne \top$). Let $v'_{f_v} \in V$ be any vertex out of $v$ such that $(v, v'_{f_v}) \in E$ is
compatible with $f_v$ in $\Gamma_v$.

Then, $\mathrm{val}^\Gamma(v'_{f_v}) = \mathrm{val}^\Gamma(v)$.

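Proof. 1. It is sufficient to construct a strategy $\sigma^v_0 \in \Sigma^M_0$ securing to Player 0 a
payoff at least $\mathrm{val}^\Gamma(v')$ from $v$ in the MPG $\Gamma$. Let
$\sigma^{v'}_0 \in \Sigma^M_0$ be a strategy securing payoff at least $\mathrm{val}^\Gamma(v')$
from $v'$ in $\Gamma$. Then, let $\sigma^v_0$ be defined as follows:

$$\sigma^v_0(u) = \begin{cases} \sigma^{v'}_0(u), & \text{if } u \in V_0 \setminus \{v\}; \\ \sigma^{v'}_0(v), & \text{if } u = v \text{ and } v \text{ is reachable from } v' \text{ in } G^\Gamma_{\sigma^{v'}_0}; \\ v', & \text{if } u = v \text{ and } v \text{ is not reachable from } v' \text{ in } G^\Gamma_{\sigma^{v'}_0}. \end{cases}$$

We argue that $\sigma^v_0$ secures payoff at least $\mathrm{val}^\Gamma(v')$ from $v$ in $\Gamma$. First
notice that, by Lemma 1 (applied to $v'$), all cycles $C$ that are reachable from $v'$ in
$G^\Gamma_{\sigma^{v'}_0}$ satisfy:

$$\frac{w(C)}{|C|} \ge \mathrm{val}^\Gamma(v').$$

The fact is that any cycle reachable from $v$ in $G^\Gamma_{\sigma^v_0}$ is also reachable from $v'$ in
$G^\Gamma_{\sigma^{v'}_0}$ (by definition of $\sigma^v_0$), therefore, the same inequality holds for all
cycles reachable from $v$. At this point, the thesis follows again by Lemma 1 (applied to $v$, in the
inverse direction). This proves Item 1.

2. The proof of Item 2 is symmetric to that of Item 1.

3. Firstly, notice that $\mathrm{val}^\Gamma(v'_{f_v}) \le \mathrm{val}^\Gamma(v)$ holds by Item 1. To
conclude the proof it is sufficient to show $\mathrm{val}^\Gamma(v'_{f_v}) \ge \mathrm{val}^\Gamma(v)$.
Recall that $(v, v'_{f_v}) \in E$ is compatible with $f_v$ in $\Gamma_v$ by hypothesis, that is:

$$f_v(v) \succeq f_v(v'_{f_v}) \ominus (w(v, v'_{f_v}) - \mathrm{val}^\Gamma(v)).$$

This, together with the fact that $v \in V_{f_v}$ (i.e., $f_v(v) \ne \top$) also holds by hypothesis,
implies that $v'_{f_v} \in V_{f_v}$ (i.e., $f_v(v'_{f_v}) \ne \top$). Thus, by Item 1 of Lemma 3,
$v'_{f_v}$ is a winning starting position of Player 0 in the EG $\Gamma_v$. Whence, by Lemma 2, it holds
that $\mathrm{val}^\Gamma(v'_{f_v}) \ge \mathrm{val}^\Gamma(v)$. This proves Item 3.

□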


We are now in position to provide a sufficient condition, for a positional strategy to be optimal, 
which is expressed in terms of reweighted EGs and their SEPMs. 

Theorem 4. Let $\Gamma = (V, E, w, (V_0, V_1))$ be an MPG. For each $v \in V$, consider the reweighted EG
$\Gamma_v := \Gamma^{w - \mathrm{val}^\Gamma(v)}$. Let $f_v : V \to \mathcal{C}_{\Gamma_v}$ be any SEPM of
$\Gamma_v$ such that $v \in V_{f_v}$ (i.e., $f_v(v) \ne \top$). Moreover,
assume: $f_{v_1} = f_{v_2}$ whenever $\mathrm{val}^\Gamma(v_1) = \mathrm{val}^\Gamma(v_2)$.

When $v \in V_0$, let $v'_{f_v} \in V$ be any vertex such that $(v, v'_{f_v}) \in E$ is compatible with
$f_v$ in the EG $\Gamma_v$, and consider the positional strategy $\sigma^*_0 \in \Sigma^M_0$ defined as
follows:

$$\sigma^*_0(v) := v'_{f_v} \text{ for every } v \in V_0.$$

Then, $\sigma^*_0$ is an optimal positional strategy for Player 0 in the MPG $\Gamma$.

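Proof. Let us consider the projection graph $G^{\Gamma}_{\sigma_0^*} := (V, E_{\sigma_0^*}, w)$. Let $v \in V$ be any vertex. In order to prove that $\sigma_0^*$ is optimal, it is sufficient (by Lemma [?]) to show that every cycle $C$ that is reachable from $v$ in $G^{\Gamma}_{\sigma_0^*}$ satisfies $\frac{w(C)}{|C|} \geq \mathrm{val}^{\Gamma}(v)$.

• Preliminaries. Let $v \in V$ and let $C$ be any cycle of length $|C| \geq 1$ that is reachable from $v$ in $G^{\Gamma}_{\sigma_0^*}$. Then, there exists a path $\rho$ of length $|\rho| \geq 1$ in $G^{\Gamma}_{\sigma_0^*}$ and such that: if $|\rho| = 1$, then $\rho = \rho_0 \rho_1 = v v$; otherwise, if $|\rho| > 1$, then:

$$\rho = \rho_0 \ldots \rho_{|\rho|} = v v_1 v_2 \ldots v_k u_1 u_2 \ldots u_{|C|} u_1,$$

where $v v_1 \ldots v_k$ is a simple path, for some $k \geq 0$, and $u_1 \ldots u_{|C|} u_1 = C$.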



Figure 3: A cycle $C$ that is reachable from $v$ through $v v_1 \cdots v_k$ in $G^{\Gamma}_{\sigma_0^*}$.


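• Fact 1. It holds $\mathrm{val}^{\Gamma}(\rho_i) \leq \mathrm{val}^{\Gamma}(\rho_{i+1})$ for every $i \in [0, |\rho|)$.

Proof of Fact 1. If $\rho_i \in V_0$ then $\mathrm{val}^{\Gamma}(\rho_i) = \mathrm{val}^{\Gamma}(\rho_{i+1})$ by Item 3 of Lemma [?]; otherwise, if $\rho_i \in V_1$, then $\mathrm{val}^{\Gamma}(\rho_i) \leq \mathrm{val}^{\Gamma}(\rho_{i+1})$ by Item 2 of Lemma [?]. This proves Fact 1. In particular, notice that $\mathrm{val}^{\Gamma}(v) \leq \mathrm{val}^{\Gamma}(u_1)$ when $|\rho| > 1$. □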


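• Fact 2. Assume $C = u_1 \ldots u_{|C|} u_1$, then $\mathrm{val}^{\Gamma}(u_i) = \mathrm{val}^{\Gamma}(u_1)$ for every $i \in [1, |C|]$.

Proof of Fact 2. By Fact 1, $\mathrm{val}^{\Gamma}(u_{i-1}) \leq \mathrm{val}^{\Gamma}(u_i)$ for every $i \in [2, |C|]$, as well as $\mathrm{val}^{\Gamma}(u_{|C|}) \leq \mathrm{val}^{\Gamma}(u_1)$. Then, the following chain of inequalities holds:

$$\mathrm{val}^{\Gamma}(u_1) \leq \mathrm{val}^{\Gamma}(u_2) \leq \ldots \leq \mathrm{val}^{\Gamma}(u_{|C|}) \leq \mathrm{val}^{\Gamma}(u_1).$$

Since the first and the last value of the chain are actually the same, i.e., $\mathrm{val}^{\Gamma}(u_1)$, all these inequalities are indeed equalities. This proves Fact 2. □

• Fact 3. The following holds for every $i \in [0, |\rho|)$:

$$f_{\rho_i}(\rho_i), f_{\rho_i}(\rho_{i+1}) \neq \top \quad \text{and} \quad f_{\rho_i}(\rho_i) \geq f_{\rho_i}(\rho_{i+1}) - w(\rho_i, \rho_{i+1}) + \mathrm{val}^{\Gamma}(\rho_i).$$

Proof of Fact 3. Firstly, we argue that any arc $(\rho_i, \rho_{i+1}) \in E$ is compatible with $f_{\rho_i}$ in $\Gamma_{\rho_i}$. Indeed, if $\rho_i \in V_0$, then $(\rho_i, \rho_{i+1})$ is compatible with $f_{\rho_i}$ in $\Gamma_{\rho_i}$ because $\rho_{i+1} = \sigma_0^*(\rho_i)$ by hypothesis; otherwise, if $\rho_i \in V_1$, then $(\rho_i, x)$ is compatible with $f_{\rho_i}$ in $\Gamma_{\rho_i}$ for every $x \in \mathrm{post}(\rho_i)$, in particular for $x = \rho_{i+1}$, by definition of SEPM.

At this point, since $(\rho_i, \rho_{i+1})$ is compatible with $f_{\rho_i}$ in $\Gamma_{\rho_i}$, then:

$$f_{\rho_i}(\rho_i) \succeq f_{\rho_i}(\rho_{i+1}) \ominus \big(w(\rho_i, \rho_{i+1}) - \mathrm{val}^{\Gamma}(\rho_i)\big).$$

Now, recall that $\rho_i \in V_{f_{\rho_i}}$ (i.e., $f_{\rho_i}(\rho_i) \neq \top$) holds for every $\rho_i$ by hypothesis. Since $f_{\rho_i}(\rho_i) \neq \top$ and the above inequality holds, then we have $f_{\rho_i}(\rho_{i+1}) \neq \top$. Thus, we can safely write:

$$f_{\rho_i}(\rho_i) \geq f_{\rho_i}(\rho_{i+1}) - w(\rho_i, \rho_{i+1}) + \mathrm{val}^{\Gamma}(\rho_i).$$

This proves Fact 3. □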


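• Fact 4. Assume that the cycle $C = u_1 \ldots u_{|C|} u_1$ is such that:

$$\mathrm{val}^{\Gamma}(u_i) = \mathrm{val}^{\Gamma}(u_1) \geq \mathrm{val}^{\Gamma}(v), \quad \text{for every } i \in [1, |C|].$$

Then, provided that $u_{|C|+1} := u_1$, the following holds for every $i \in [1, |C|]$:

$$f_{u_1}(u_1), f_{u_{i+1}}(u_{i+1}) \neq \top \quad \text{and} \quad f_{u_1}(u_1) \geq f_{u_{i+1}}(u_{i+1}) - \sum_{j=1}^{i} w(u_j, u_{j+1}) + i \cdot \mathrm{val}^{\Gamma}(v).$$

Proof of Fact 4. Firstly, notice that $f_{u_1}(u_1), f_{u_{i+1}}(u_{i+1}) \neq \top$ holds by hypothesis.

The proof proceeds by induction on $i \in [1, |C|]$.

- Base Case. Assume that $|C| = 1$, so that $C = u_1 u_1$. Then $f_{u_1}(u_1) \geq f_{u_1}(u_1) - w(u_1, u_1) + \mathrm{val}^{\Gamma}(u_1)$ follows by Fact 3. Since $\mathrm{val}^{\Gamma}(u_1) \geq \mathrm{val}^{\Gamma}(v)$ by hypothesis, the thesis follows.

- Inductive Step. Assume by induction hypothesis that the following holds:

$$f_{u_1}(u_1) \geq f_{u_i}(u_i) - \sum_{j=1}^{i-1} w(u_j, u_{j+1}) + (i-1) \cdot \mathrm{val}^{\Gamma}(v).$$

By Fact 3, we have:

$$f_{u_i}(u_i) \geq f_{u_i}(u_{i+1}) - w(u_i, u_{i+1}) + \mathrm{val}^{\Gamma}(u_i).$$

Since $\mathrm{val}^{\Gamma}(u_{i+1}) = \mathrm{val}^{\Gamma}(u_i)$ holds by hypothesis, then we have $f_{u_{i+1}} = f_{u_i}$. Recall that $\mathrm{val}^{\Gamma}(u_i) \geq \mathrm{val}^{\Gamma}(v)$ also holds by hypothesis.

Thus, we obtain the following:

$$f_{u_1}(u_1) \geq f_{u_{i+1}}(u_{i+1}) - \sum_{j=1}^{i} w(u_j, u_{j+1}) + i \cdot \mathrm{val}^{\Gamma}(v).$$

This proves Fact 4. □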


• We are now in position to show that every cycle $C$ that is reachable from $v$ in $G^{\Gamma}_{\sigma_0^*}$ satisfies $w(C)/|C| \geq \mathrm{val}^{\Gamma}(v)$. By Fact 1 and Fact 2, we have $\mathrm{val}^{\Gamma}(v) \leq \mathrm{val}^{\Gamma}(u_1) = \mathrm{val}^{\Gamma}(u_i)$ for every $i \in [1, |C|]$. At this point, we apply Fact 4. Consider the specialization of Fact 4 when $i = |C|$ and also recall that $u_{|C|+1} = u_1$. Then, we have the following:

$$f_{u_1}(u_1) \geq f_{u_1}(u_1) - \sum_{j=1}^{|C|} w(u_j, u_{j+1}) + |C| \cdot \mathrm{val}^{\Gamma}(v).$$

As a consequence, the following lower bound holds on the average weight of $C$:

$$\frac{w(C)}{|C|} = \frac{1}{|C|} \sum_{j=1}^{|C|} w(u_j, u_{j+1}) \geq \mathrm{val}^{\Gamma}(v),$$

which concludes the proof. □


Remark 1. Notice that Theorem 4 holds, in particular, when $f_v$ is the least SEPM $f_v^*$ of the reweighted EG $\Gamma_v$. This follows because $v \in V_{f_v^*}$ always holds for the least SEPM $f_v^*$ of the EG $\Gamma_v$, as shown next: by Lemma [?] and by definition of $\Gamma_v$, $v$ is a winning starting position for Player 0 in the EG $\Gamma_v$ (for some initial credit); now, since $f_v^*$ is the least SEPM of the EG $\Gamma_v$, then $v \in V_{f_v^*}$ follows by Item 2 of Lemma [?].
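As a rough illustration of how Theorem 4 turns SEPMs into a strategy, the following Python sketch picks, for each Player 0 vertex $v$, an arc that is compatible with a given SEPM $f_v$ of $\Gamma_v$. It is only a sketch under stated assumptions: the values and the SEPMs are taken as inputs (they are not computed here), $a \ominus b$ is assumed to be $\max(0, a - b)$, or $\top$ when $a = \top$ or the result exceeds a given bound, and the names `TOP`, `ominus`, `succeq` and `pick_compatible_arcs` are hypothetical.

```python
from fractions import Fraction

TOP = None  # hypothetical sentinel standing for the top element


def ominus(a, b, bound):
    """Assumed convention: max(0, a - b), or TOP if a is TOP or the result exceeds bound."""
    if a is TOP:
        return TOP
    r = max(Fraction(0), Fraction(a) - Fraction(b))
    return r if r <= bound else TOP


def succeq(a, b):
    """a >= b in the order where TOP is the greatest element."""
    if a is TOP:
        return True
    if b is TOP:
        return False
    return a >= b


def pick_compatible_arcs(post, weight, v0, val, sepm, bound):
    """For each v in v0, return some successor u such that the arc (v, u)
    is compatible with sepm[v] in the game reweighted by w - val[v]."""
    strategy = {}
    for v in v0:
        f = sepm[v]
        for u in post[v]:
            if succeq(f[v], ominus(f[u], weight[(v, u)] - val[v], bound)):
                strategy[v] = u
                break
    return strategy


# Toy arena (an assumption for illustration): both vertices belong to Player 0.
post = {"a": ["a", "b"], "b": ["b"]}
weight = {("a", "a"): 0, ("a", "b"): 1, ("b", "b"): 2}
val = {"a": 2, "b": 2}          # both vertices can reach the loop of weight 2
f = {"a": 1, "b": 0}            # least SEPM of the game reweighted by w - 2
strategy = pick_compatible_arcs(post, weight, ["a", "b"], val, {"a": f, "b": f}, bound=10)
```

In this toy arena the self-loop on `a` is rejected, because following it would require three units of energy while only one is available, so the sketch selects the arc towards `b`.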


4 An $O(|V|^2 |E| W)$ time Algorithm for solving the Value Problem and Optimal Strategy Synthesis in MPGs


This section offers a deterministic algorithm for solving the Value Problem and Optimal Strategy Synthesis of MPGs within $O(|V|^2 |E| W)$ time and $O(|V|)$ space, on any input MPG $\Gamma = (V, E, w, \langle V_0, V_1 \rangle)$.

Let us now recall some notation in order to describe the algorithm in a suitable way.

Given an MPG $\Gamma = (V, E, w, \langle V_0, V_1 \rangle)$, consider again the following reweightings:

$$\Gamma_{i,j} := \Gamma^{w - i - F_j}, \quad \text{for any } i \in [-W, W] \text{ and } j \in [0, s-1],$$

where $s := |\mathcal{F}_{|V|}|$ and $F_j$ is the $j$-th term of $\mathcal{F}_{|V|}$.

Assuming $F_j = N_j / D_j$ for some $N_j, D_j \in \mathbb{N}$, we focus on the following weights:

$$w_{i,j} := w - i - F_j = w - i - \frac{N_j}{D_j};$$

$$w'_{i,j} := D_j \, w_{i,j} = D_j (w - i) - N_j.$$

Recall that, once its weights are scaled by $D_j$, $\Gamma_{i,j}$ is handled as $\Gamma^{w'_{i,j}}$, which is an arena having integer weights. Also notice that, since $F_0 < \ldots < F_{s-1}$ is monotone increasing, then the corresponding weight functions $w_{i,j}$ can be ordered in a natural way, i.e., $w_{-W,1} > w_{-W,2} > \ldots > w_{i,j} > \ldots > w_{W,s-1}$. In the rest of this section, we denote by $f^*_{w'_{i,j}} : V \to \mathcal{C}_{\Gamma_{i,j}}$ the least SEPM of the reweighted EG $\Gamma_{i,j}$. Moreover, the function $f^*_{i,j} : V \to \mathbb{Q}$, defined as $f^*_{i,j}(v) := \frac{1}{D_j} f^*_{w'_{i,j}}(v)$ for every $v \in V$, is called the rational scaling of $f^*_{w'_{i,j}}$.
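To make the notation concrete, here is a small Python sketch that enumerates the Farey sequence and evaluates the integer weight $w'_{i,j}(e) = D_j (w(e) - i) - N_j$ for a single arc. It uses the textbook next-term recurrence, which takes time linear in the length of the sequence; it is not the sublinear algorithm that the paper relies on, and the function names `farey` and `scaled_weight` are hypothetical.

```python
from fractions import Fraction


def farey(n):
    """Return the Farey sequence F_n as a list of Fractions, in increasing order."""
    a, b, c, d = 0, 1, 1, n
    terms = [Fraction(a, b)]
    while c <= n:
        k = (n + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        terms.append(Fraction(a, b))
    return terms


def scaled_weight(w_e, i, term):
    """Integer weight D_j * (w(e) - i) - N_j, where term = N_j / D_j."""
    return term.denominator * (w_e - i) - term.numerator


F3 = farey(3)                                    # 0, 1/3, 1/2, 2/3, 1
example = scaled_weight(2, 1, Fraction(1, 3))    # 3 * (2 - 1) - 1
```

The scaled weight is the rational weight $w(e) - i - F_j$ multiplied by $D_j$, which is why it is always an integer when $w(e)$ and $i$ are.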


4.1 Description of the Algorithm 

In this section we shall describe a procedure whose pseudo-code is given below in Algorithm 1. It takes as input an arena $\Gamma = (V, E, w, \langle V_0, V_1 \rangle)$, and it aims to return a tuple $(\mathcal{W}_0, \mathcal{W}_1, \nu, \sigma_0^*)$ such that: $\mathcal{W}_0$ and $\mathcal{W}_1$ are the winning regions of Player 0 and Player 1 in the MPG $\Gamma$ (respectively), $\nu : V \to S_{\Gamma}$ is a map sending each starting position $v \in V$ to its optimal value, i.e., $\nu(v) = \mathrm{val}^{\Gamma}(v)$, and finally, $\sigma_0^* : V_0 \to V$ is an optimal positional strategy for Player 0 in the MPG $\Gamma$.

The intuition underlying Algorithm 1 is that of considering the following sequence of weights:

$$w_{-W,1} > w_{-W,2} > \ldots > w_{-W,s-1} > w_{-W+1,1} > w_{-W+1,2} > \ldots > w_{W-1,s-1} > \ldots > w_{W,s-1},$$

where the key idea is to rely on Theorem [?] at each one of these steps, testing whether a transition of winning regions has occurred. Stated otherwise, the idea is to check, for each vertex $v \in V$, whether $v$ is winning for Player 1 with respect to the current weight $w_{i,j}$, meanwhile recalling whether $v$ was winning for Player 0 with respect to the immediately preceding element $w_{\mathrm{prev}(i,j)}$ in the weight sequence above.

Figure 4: An illustration of Algorithm 1.

If such a transition occurs, say for some $v \in \mathcal{W}_0(\Gamma_{\mathrm{prev}(i,j)}) \cap \mathcal{W}_1(\Gamma_{i,j})$, then one can easily compute $\mathrm{val}^{\Gamma}(v)$ by relying on Theorem [?]. Also, at that point, it is easy to compute an optimal positional strategy, provided that $v \in V_0$, by relying on Theorem 4 and Remark 1 in that case.

Each one of these phases, in which one looks at transitions of winning regions, is named Scan Phase. A graphical intuition of Algorithm 1 is given in Fig. 4.

An in-depth description of the algorithm and of its pseudo-code now follows.
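Algorithm 1: Solving the Value Problem and Strategy Synthesis in MPGs.

Procedure solve_MPG($\Gamma$)
    input : an MPG $\Gamma = (V, E, w, \langle V_0, V_1 \rangle)$.
    output: a tuple $(\mathcal{W}_0, \mathcal{W}_1, \nu, \sigma_0^*)$ such that: $\mathcal{W}_0$ and $\mathcal{W}_1$ are the winning regions of Player 0 and Player 1 (respectively) in the MPG $\Gamma$; $\nu : V \to S_{\Gamma}$ is a map sending each starting position $v \in V$ to its corresponding optimal value, i.e., $\nu(v) = \mathrm{val}^{\Gamma}(v)$; and $\sigma_0^* : V_0 \to V$ is an optimal positional strategy for Player 0 in the MPG $\Gamma$.

    // Init Phase
 1  $\mathcal{W}_0 \leftarrow \emptyset$; $\mathcal{W}_1 \leftarrow \emptyset$;
 2  $f(v) \leftarrow 0$, $\forall v \in V$;
 3  $W \leftarrow \max_{e \in E} |w_e|$; $w' \leftarrow w + W$; $D \leftarrow 1$;
 4  $s \leftarrow$ compute the size $|\mathcal{F}_{|V|}|$ of $\mathcal{F}_{|V|}$;   // with the algorithm of [10]
    // Scan Phases
 5  for $i = -W$ to $W$ do
 6      $F \leftarrow 0$;
 7      for $j = 1$ to $s - 1$ do
 8          prev_f $\leftarrow f$;
 9          prev_w $\leftarrow w'$;
10          prev_F $\leftarrow F$;
11          $F \leftarrow$ generate the $j$-th term of $\mathcal{F}_{|V|}$;   // with the algorithm of [10]
12          $N \leftarrow$ numerator of $F$;
13          $D \leftarrow$ denominator of $F$;
14          $w' \leftarrow D (w - i) - N$;
15          $f \leftarrow \frac{1}{D}$ Value-Iteration($\Gamma^{w'}$, $\lceil D \cdot$ prev_f $\rceil$);
16          for $v \in V$ do
17              if prev_f$(v) \neq \top$ and $f(v) = \top$ then
18                  $\nu(v) \leftarrow i +$ prev_F;   // set optimal value $\nu$
19                  if $\nu(v) \geq 0$ then
20                      $\mathcal{W}_0 \leftarrow \mathcal{W}_0 \cup \{v\}$;   // $v$ is winning for Player 0
21                  else
22                      $\mathcal{W}_1 \leftarrow \mathcal{W}_1 \cup \{v\}$;   // $v$ is winning for Player 1
23                  if $v \in V_0$ then
24                      for $u \in \mathrm{post}(v)$ do
25                          if prev_f$(v) \succeq$ prev_f$(u) \ominus$ prev_w$(v, u)$ then
26                              $\sigma_0^*(v) \leftarrow u$; break;
27  return $(\mathcal{W}_0, \mathcal{W}_1, \nu, \sigma_0^*)$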
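The two nested loops of Algorithm 1 visit the index pairs $(i, j)$ in a fixed order, and each pair has a predecessor in that order (written $\mathrm{prev}(i,j)$ in the description that follows). As a minimal sketch, assuming the Farey terms are already available as a Python list of `Fraction` values, the order and the predecessor can be written down as follows; the names `prev_index` and `scan_order` are hypothetical.

```python
from fractions import Fraction


def prev_index(i, j, s, W):
    """Predecessor of the scan-phase index (i, j), for j in [1, s-1]."""
    if j > 1:
        return (i, j - 1)
    if i > -W:
        return (i - 1, s - 1)
    return (-W, 0)  # predecessor of the first index (-W, 1)


def scan_order(farey_terms, W):
    """Yield (i, j, z) with z = i + F[j], following the two nested loops."""
    s = len(farey_terms)
    for i in range(-W, W + 1):
        for j in range(1, s):
            yield i, j, i + farey_terms[j]


terms = [Fraction(0), Fraction(1, 2), Fraction(1)]   # Farey sequence of order 2
phases = list(scan_order(terms, W=1))
zs = [z for _, _, z in phases]
```

Because the last Farey term is 1 and the first is 0, the value attached to $(i-1, s-1)$ coincides with the one attached to $(i, 0)$, so the scanned values increase strictly across the boundary between consecutive values of $i$.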


• Initialization Phase. To start with, the algorithm performs an initialization phase. At line 1, Algorithm 1 initializes the output variables $\mathcal{W}_0$ and $\mathcal{W}_1$ to be empty sets. Notice that, within the pseudo-code, the variables $\mathcal{W}_0$ and $\mathcal{W}_1$ represent the winning regions of Player 0 and Player 1, respectively; also, the variable $\nu$ represents the optimal values of the input MPG $\Gamma$, and $\sigma_0^*$ represents an optimal positional strategy for Player 0 in the input MPG $\Gamma$. Secondly, at line 2, an array variable $f$ indexed by $V$ is initialized to $f(v) = 0$ for every $v \in V$; throughout the computation, the variable $f$ represents a SEPM. Next, at line 3, the greatest absolute weight $W$ is assigned as $W = \max_{e \in E} |w_e|$, an auxiliary weight function $w'$ is initialized as $w' = w + W$, and a "denominator" variable is initialized as $D = 1$. Concluding the initialization phase, at line 4 the size (i.e., the total number of terms) of $\mathcal{F}_{|V|}$ is computed and assigned to the variable $s$. This size can be computed very efficiently with the algorithm devised by Pawlewicz and Pătraşcu [10].

• Scan Phases. After initialization, the procedure performs multiple Scan Phases. Each one of these is indexed by a pair of integers $(i, j)$, where $i \in [-W, W]$ (at line 5) and $j \in [1, s-1]$ (at line 7). Thus, the index $i$ goes from $-W$ to $W$, and for each $i$, the index $j$ goes from 1 to $s - 1$.

At each step, we say that the algorithm goes through the $(i,j)$-th scan phase. For each scan phase, we also need to consider the previous scan phase, so that the previous index $\mathrm{prev}(i,j)$ shall be defined as follows: the predecessor of the first index is $\mathrm{prev}(-W, 1) := (-W, 0)$; if $j > 1$, then $\mathrm{prev}(i,j) := (i, j-1)$; finally, if $j = 1$ and $i > -W$, then $\mathrm{prev}(i,j) := (i-1, s-1)$.

At the $(i,j)$-th scan phase, the algorithm considers the rational number $z_{i,j} \in S_{\Gamma}$ defined as:

$$z_{i,j} := i + F[j],$$

where $F[j] = N_j / D_j$ is the $j$-th term of $\mathcal{F}_{|V|}$. For each $j$, $F[j]$ can be computed very efficiently, on the fly, with the algorithm of Pawlewicz and Pătraşcu [10]. Notice that, since $F[0] < \ldots < F[s-1]$ is monotonically increasing, then the values $z_{i,j}$ are scanned in increasing order as well. At this point, the procedure aims to compute the rational scaling $f^*_{i,j}$ of the least SEPM $f^*_{w'_{i,j}}$, i.e.,

$$f^*_{i,j} := \frac{1}{D_j} f^*_{w'_{i,j}}.$$

This computation is really at the heart of the algorithm and it goes from line 8 to line 15. To start with, at line 8 and line 9, the previous rational scaling $f^*_{\mathrm{prev}(i,j)}$ and the previous weight function $w_{\mathrm{prev}(i,j)}$ (i.e., those considered during the previous scan phase) are saved into the auxiliary variables prev_f and prev_w.


Remark. Since the values $z_{i,j}$ are scanned in increasing order of magnitude, then prev_f $= f^*_{\mathrm{prev}(i,j)}$ bounds from below $f^*_{i,j}$. That is, it holds for every $v \in V$ that:

$$\text{prev\_f}(v) = f^*_{\mathrm{prev}(i,j)}(v) \preceq f^*_{i,j}(v).$$

The underlying intuition, at this point, is that of computing the energy levels of $f = f^*_{i,j}$ by first initializing them to the energy levels of the previous scan phase, i.e., to prev_f $= f^*_{\mathrm{prev}(i,j)}$, and by then updating them monotonically upwards by executing the Value-Iteration algorithm for EGs.

Further details of this pivotal step now follow. Firstly, since the Value-Iteration has been designed to work with integer numerical weights only [?], then the weights $w_{i,j} = w - z_{i,j}$ have to be scaled from $\mathbb{Q}$ to $\mathbb{Z}$: this is performed in the standard way, from line 12 to line 14, by considering the numerator $N_j$ and the denominator $D_j$ of $F[j]$, and then by setting:

$$w'_{i,j}(e) := D_j \big(w(e) - i\big) - N_j, \quad \text{for every } e \in E.$$

The initial energy levels are also scaled up from $\mathbb{Q}$ to $\mathbb{Z}$ by considering the values $\lceil D_j \, \text{prev\_f}(v) \rceil$, for every $v \in V$ (line 15). At this point the least SEPM of $\Gamma^{w'_{i,j}}$ is computed, at line 15, by invoking Value-Iteration($\Gamma^{w'_{i,j}}$, $\lceil D_j \, \text{prev\_f} \rceil$), that is, by executing on input $\Gamma^{w'_{i,j}}$ the Value-Iteration with initial energy levels given by $\lceil D_j \, \text{prev\_f}(v) \rceil$ for every $v \in V$. Soon after that, the energy levels have to be scaled back from $\mathbb{Z}$ to $\mathbb{Q}$, so that, in summary, at line 15 they become:

$$f = f^*_{i,j} = \frac{1}{D_j} \text{Value-Iteration}\big(\Gamma^{w'_{i,j}}, \lceil D_j \, \text{prev\_f} \rceil\big).$$

The correctness of lines 14-15 will be proved in Lemma 8.

Here, let us provide a sketch of the argument:

1. Since $F_0 < \ldots < F_{s-1}$ is monotone increasing, then the sequence $\{w_{i,j}\}_{(i,j)}$ is monotone decreasing, i.e., for every $i, j$ and $e \in E$, $w_{\mathrm{prev}(i,j)}(e) > w_{i,j}(e)$. Whence, the sequence of rational scalings $\{f^*_{i,j}\}_{(i,j)}$ is monotone increasing, i.e., $f^*_{\mathrm{prev}(i,j)} \preceq f^*_{i,j}$ holds at the $(i,j)$-th step. The proof is in Lemma 8.

2. At the $(i,j)$-th iteration of line 8, it holds that prev_f $= f^*_{\mathrm{prev}(i,j)}$. This invariant property is also proved as part of Lemma 8.

3. Since prev_f $= f^*_{\mathrm{prev}(i,j)}$, then prev_f $\preceq f^*_{i,j}$. Thus, one can prove that $D_j \, \text{prev\_f} \preceq f^*_{w'_{i,j}}$.

4. Since $w'_{i,j}(e) \in \mathbb{Z}$ for every $e \in E$, then $f^*_{w'_{i,j}}(v) \in \mathbb{Z}$ for every $v \in V$, so that $\lceil D_j \, \text{prev\_f}(v) \rceil \preceq f^*_{w'_{i,j}}(v)$ holds for every $v \in V$ as well.

5. This implies that it is correct to execute the Value-Iteration, on input $\Gamma^{w'_{i,j}}$, with initial energy levels given by $\lceil D_j \, \text{prev\_f}(v) \rceil$ for every $v \in V$.

Coming back to the algorithm, once $f = f^*_{i,j}$ has been determined, then for each $v \in V$ the condition:

$$v \in \mathcal{W}_0(\Gamma_{\mathrm{prev}(i,j)}) \cap \mathcal{W}_1(\Gamma_{i,j}),$$

is checked at line 17: it is not difficult to show that, for this, it is sufficient to test whether both prev_f$(v) \neq \top$ and $f(v) = \top$ hold on $v$ (it follows by Lemma [?]).

If $v \in \mathcal{W}_0(\Gamma_{\mathrm{prev}(i,j)}) \cap \mathcal{W}_1(\Gamma_{i,j})$ holds, then the algorithm relies on Theorem [?] in order to assign the optimal value as follows: $\nu(v) := \mathrm{val}^{\Gamma}(v) = z_{\mathrm{prev}(i,j)}$ (line 18). If $\nu(v) \geq 0$, then $v$ is added to the winning region $\mathcal{W}_0$ at line 20. Otherwise, $\nu(v) < 0$ and $v$ is added to $\mathcal{W}_1$ at line 22.


To conclude, from line 23 to line 27 the algorithm proceeds as follows: if $v \in V_0$, then it computes an optimal positional strategy $\sigma_0^*(v)$ for Player 0 in $\Gamma$; this is done by testing for each $u \in \mathrm{post}(v)$ whether $(v, u) \in E$ is an arc compatible with prev_f in $\Gamma_{\mathrm{prev}(i,j)}$; namely, whether the following holds for some $u \in \mathrm{post}(v)$:

$$\text{prev\_f}(v) \overset{?}{\succeq} \text{prev\_f}(u) \ominus \text{prev\_w}(v, u).$$

If $(v, u) \in E$ is found to be compatible with prev_f at that point, then $\sigma_0^*(v) := u$ gets assigned and the arc $(v, u)$ becomes part of the optimal positional strategy returned to output. Indeed, the correctness of such an assignment relies on Theorem 4 and Remark 1.

This concludes the description of the scan phases and also that of Algorithm 1.

4.2 Proof of Correctness

Now we formally prove the correctness of Algorithm 1. The following lemma shows some basic invariants that are maintained throughout the computation.

Lemma 8. Algorithm 1 keeps the following invariants throughout the computation:

1. For every $i \in [-W, W]$ and every $j \in [1, s-1]$, it holds that:

$$f^*_{\mathrm{prev}(i,j)}(v) \preceq f^*_{i,j}(v), \quad \text{for every } v \in V;$$

2. At the $(i,j)$-th iteration of line 8, it holds that: prev_f $= f^*_{\mathrm{prev}(i,j)}$;

3. At the $(i,j)$-th iteration of line 15, it holds that: $\lceil D_j \, \text{prev\_f} \rceil \preceq f^*_{w'_{i,j}}$;

4. At the $(i,j)$-th iteration of line 15, it holds that:

$$\frac{1}{D_j} \text{Value-Iteration}\big(\Gamma^{w'_{i,j}}, \lceil D_j \, \text{prev\_f} \rceil\big) = f^*_{i,j}.$$


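Proof.

• Proof (of Item 1). Recall that $w_{i,j} := w - i - F_j$. Since $F_0 < \ldots < F_{s-1}$ is monotone increasing, then $w_{i,j}(e) < w_{\mathrm{prev}(i,j)}(e)$ holds for every $e \in E$. In what follows, $D_{i,j}$ stands for $D_j$, and $D_{\mathrm{prev}(i,j)}$ for the denominator considered at the $\mathrm{prev}(i,j)$-th scan phase.

In order to prove the thesis, consider the following function:

$$g : V \to \mathbb{Q} \cup \{\top\} : v \mapsto \min\big\{ f^*_{\mathrm{prev}(i,j)}(v), f^*_{i,j}(v) \big\}.$$

We show that $D_{\mathrm{prev}(i,j)} \, g$ is a SEPM of $\Gamma^{w'_{\mathrm{prev}(i,j)}}$. There are four cases, according to whether $v \in V_0$ or $v \in V_1$, and $g(v) = f^*_{\mathrm{prev}(i,j)}(v)$ or $g(v) = f^*_{i,j}(v)$.

- Case: $v \in V_0$. Then, the following holds for some $u \in \mathrm{post}(v)$:

* Case: $g(v) = f^*_{\mathrm{prev}(i,j)}(v)$:

$$\begin{aligned}
D_{\mathrm{prev}(i,j)} \, g(v) &= D_{\mathrm{prev}(i,j)} \, f^*_{\mathrm{prev}(i,j)}(v) && [\text{by } g(v) = f^*_{\mathrm{prev}(i,j)}(v)] \\
&= f^*_{w'_{\mathrm{prev}(i,j)}}(v) && [\text{by } D_{\mathrm{prev}(i,j)} f^*_{\mathrm{prev}(i,j)} = f^*_{w'_{\mathrm{prev}(i,j)}}] \\
&\succeq f^*_{w'_{\mathrm{prev}(i,j)}}(u) \ominus w'_{\mathrm{prev}(i,j)}(v,u) && [f^*_{w'_{\mathrm{prev}(i,j)}} \text{ is SEPM of } \Gamma^{w'_{\mathrm{prev}(i,j)}}] \\
&= D_{\mathrm{prev}(i,j)} \, f^*_{\mathrm{prev}(i,j)}(u) \ominus w'_{\mathrm{prev}(i,j)}(v,u) && [\text{by } D_{\mathrm{prev}(i,j)} f^*_{\mathrm{prev}(i,j)} = f^*_{w'_{\mathrm{prev}(i,j)}}] \\
&\succeq D_{\mathrm{prev}(i,j)} \, g(u) \ominus w'_{\mathrm{prev}(i,j)}(v,u) && [\text{by definition of } g(u)]
\end{aligned}$$

* Case: $g(v) = f^*_{i,j}(v)$:

$$\begin{aligned}
D_{\mathrm{prev}(i,j)} \, g(v) &= D_{\mathrm{prev}(i,j)} \, f^*_{i,j}(v) && [\text{by } g(v) = f^*_{i,j}(v)] \\
&= \frac{D_{\mathrm{prev}(i,j)}}{D_{i,j}} \, f^*_{w'_{i,j}}(v) && [\text{by } f^*_{i,j} = f^*_{w'_{i,j}} / D_{i,j}] \\
&\succeq \frac{D_{\mathrm{prev}(i,j)}}{D_{i,j}} \Big( f^*_{w'_{i,j}}(u) \ominus w'_{i,j}(v,u) \Big) && [f^*_{w'_{i,j}} \text{ is SEPM of } \Gamma^{w'_{i,j}}] \\
&= D_{\mathrm{prev}(i,j)} \, f^*_{i,j}(u) \ominus \frac{D_{\mathrm{prev}(i,j)}}{D_{i,j}} \, w'_{i,j}(v,u) && [\text{by } f^*_{i,j} = f^*_{w'_{i,j}} / D_{i,j}] \\
&= D_{\mathrm{prev}(i,j)} \, f^*_{i,j}(u) \ominus D_{\mathrm{prev}(i,j)} \, w_{i,j}(v,u) && [\text{by } w_{i,j}(v,u) = w'_{i,j}(v,u) / D_{i,j}] \\
&\succeq D_{\mathrm{prev}(i,j)} \, f^*_{i,j}(u) \ominus D_{\mathrm{prev}(i,j)} \, w_{\mathrm{prev}(i,j)}(v,u) && [\text{by } w_{i,j} \leq w_{\mathrm{prev}(i,j)}] \\
&= D_{\mathrm{prev}(i,j)} \, f^*_{i,j}(u) \ominus w'_{\mathrm{prev}(i,j)}(v,u) && [\text{by } D_{\mathrm{prev}(i,j)} w_{\mathrm{prev}(i,j)} = w'_{\mathrm{prev}(i,j)}] \\
&\succeq D_{\mathrm{prev}(i,j)} \, g(u) \ominus w'_{\mathrm{prev}(i,j)}(v,u) && [\text{by definition of } g(u)]
\end{aligned}$$

This means that $(v, u)$ is an arc compatible with $D_{\mathrm{prev}(i,j)} \, g$ in $\Gamma^{w'_{\mathrm{prev}(i,j)}}$.

- Case: $v \in V_1$. The same argument shows that $(v, u) \in E$ is compatible with $D_{\mathrm{prev}(i,j)} \, g$ in $\Gamma^{w'_{\mathrm{prev}(i,j)}}$, but it holds for all $u \in \mathrm{post}(v)$ in this case.

This proves that $D_{\mathrm{prev}(i,j)} \, g$ is a SEPM of $\Gamma^{w'_{\mathrm{prev}(i,j)}}$.

Since $f^*_{w'_{\mathrm{prev}(i,j)}}$ is the least SEPM of $\Gamma^{w'_{\mathrm{prev}(i,j)}}$, then:

$$f^*_{w'_{\mathrm{prev}(i,j)}}(v) \preceq D_{\mathrm{prev}(i,j)} \, g(v), \quad \text{for every } v \in V.$$

Since $f^*_{w'_{\mathrm{prev}(i,j)}} = D_{\mathrm{prev}(i,j)} \, f^*_{\mathrm{prev}(i,j)}$ and $g = \min(f^*_{\mathrm{prev}(i,j)}, f^*_{i,j})$, then:

$$D_{\mathrm{prev}(i,j)} \, f^*_{\mathrm{prev}(i,j)} \preceq D_{\mathrm{prev}(i,j)} \min\big(f^*_{\mathrm{prev}(i,j)}, f^*_{i,j}\big).$$

Whence $f^*_{\mathrm{prev}(i,j)} \preceq \min\big(f^*_{\mathrm{prev}(i,j)}, f^*_{i,j}\big)$.

This proves that $f^*_{\mathrm{prev}(i,j)}(v) \preceq f^*_{i,j}(v)$ holds for every $v \in V$.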

• Fact 1. Next, we prove that if Item 2 holds at the $(i,j)$-th scan phase, then both Item 3 and Item 4 hold at the $(i,j)$-th scan phase as well.

Proof of Fact 1. Assume that Item 2 holds. Let us prove Item 3 first. Since $f^*_{\mathrm{prev}(i,j)} \preceq f^*_{i,j}$ holds by Item 1, and since prev_f $= f^*_{\mathrm{prev}(i,j)}$ holds by hypothesis, then $\text{prev\_f}(v) \preceq f^*_{i,j}(v)$ holds for every $v \in V$. Since $w'_{i,j} = D_j w_{i,j}$ and $f^*_{w'_{i,j}} = D_j f^*_{i,j}$, then $D_j \, \text{prev\_f}(v) \preceq f^*_{w'_{i,j}}(v)$ holds for every $v \in V$. Since $w'_{i,j}(e) \in \mathbb{Z}$ for every $e \in E$, then $f^*_{w'_{i,j}}(v) \in \mathbb{Z}$ for every $v \in V$, so that $\lceil D_j \, \text{prev\_f}(v) \rceil \preceq f^*_{w'_{i,j}}(v)$ holds for every $v \in V$ as well. This proves Item 3.

We show Item 4 now. Since Item 3 holds, at line 15 it is correct to initialize the starting energy levels of Value-Iteration() to $\lceil D_j \, \text{prev\_f}(v) \rceil$ for every $v \in V$, in order to execute the Value-Iteration on input $\Gamma^{w'_{i,j}}$.

This implies the following:

$$\text{Value-Iteration}\big(\Gamma^{w'_{i,j}}, \lceil D_j \, \text{prev\_f} \rceil\big) = f^*_{w'_{i,j}}.$$

We know that $\frac{1}{D_j} f^*_{w'_{i,j}} = f^*_{i,j}$.

This proves that Item 4 holds and concludes the proof of Fact 1. □


• Fact 2. We now prove that Item 2 holds at each iteration of line 8.

Proof of Fact 2. The proof proceeds by induction on $(i,j)$.

Base Case. Let us consider the first iteration of line 8, i.e., the iteration indexed by $i = -W$ and $j = 1$. Recall that, at line 2 of Algorithm 1, the function $f$ is initialized as $f(v) = 0$ for every $v \in V$. Notice that $f$ is really the least SEPM $f^*_{-W,0}$ of $\Gamma_{-W,0} = \Gamma^{w+W}$, because every arc $e \in E$ has a non-negative weight in $\Gamma^{w+W}$, i.e., $w_e + W \geq 0$ for every $e \in E$.

Hence, at the first iteration of line 8, the following holds:

$$\text{prev\_f} = 0 = f^*_{-W,0} = f^*_{\mathrm{prev}(-W,1)}.$$

Inductive Step. Let us assume that Item 2 holds for the $\mathrm{prev}(i,j)$-th iteration, and let us prove it for the $(i,j)$-th one. Hereafter, let us denote $(i_p, j_p) := \mathrm{prev}(i,j)$ for convenience. Since Item 2 holds for the $(i_p, j_p)$-th iteration by induction hypothesis, then, by Fact 1, the following holds at the $(i_p, j_p)$-th iteration of line 15:

$$\frac{1}{D_{j_p}} \text{Value-Iteration}\big(\Gamma^{w'_{i_p,j_p}}, \lceil D_{j_p} \, \text{prev\_f} \rceil\big) = f = f^*_{i_p,j_p}.$$

Thus, at the $(i,j)$-th iteration of line 8:

$$\text{prev\_f} = f = f^*_{i_p,j_p} = f^*_{\mathrm{prev}(i,j)}.$$

This concludes the proof of Fact 2. □

At this point, by Fact 1 and Fact 2, Lemma 8 follows. □
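Items 3 and 4 of Lemma 8 revolve around two conversions: rounding the rational energy levels up to integers before the Value-Iteration is invoked, and dividing the result by the denominator afterwards. A minimal sketch of these two steps, assuming energy levels are stored in a dictionary and the top element is represented by a hypothetical sentinel `TOP`, could look as follows (`scale_up` and `scale_back` are hypothetical names):

```python
import math
from fractions import Fraction

TOP = None  # hypothetical sentinel for the top element


def scale_up(prev_f, D):
    """Integer initial energy levels ceil(D * prev_f(v)); TOP stays TOP."""
    return {v: (TOP if x is TOP else math.ceil(D * x)) for v, x in prev_f.items()}


def scale_back(f_int, D):
    """Rational scaling (1/D) * f(v) of integer energy levels; TOP stays TOP."""
    return {v: (TOP if x is TOP else Fraction(x, D)) for v, x in f_int.items()}


prev_f = {"a": Fraction(1, 2), "b": Fraction(0), "c": TOP}
start = scale_up(prev_f, 3)        # levels handed to the Value-Iteration
back = scale_back(start, 3)        # the same levels, read again as rationals
```

Rounding up never lowers an energy level, which is consistent with the requirement that the computation starts from levels that lie between the previous rational scaling and the least SEPM of the current scan phase.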


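We are now in the position to show that Algorithm 1 is correct.

Proposition 1. Assume that Algorithm 1 is invoked on input $\Gamma = (V, E, w, \langle V_0, V_1 \rangle)$ and, whence, that it returns $(\mathcal{W}_0, \mathcal{W}_1, \nu, \sigma_0^*)$ as output.

Then, $\mathcal{W}_0$ and $\mathcal{W}_1$ are the winning sets of Player 0 and Player 1 in $\Gamma$ (respectively), $\nu : V \to S_{\Gamma}$ is such that $\nu(v) = \mathrm{val}^{\Gamma}(v)$ for every $v \in V$, and $\sigma_0^* : V_0 \to V$ is an optimal positional strategy for Player 0 in the MPG $\Gamma$.

Proof. At the $(i,j)$-th iteration of line 16, the following holds by Lemma 8:

$$\text{prev\_f} = f^*_{\mathrm{prev}(i,j)} \quad \text{and} \quad f = f^*_{i,j}.$$

Our aim now is to apply Theorem [?]. For this, firstly observe that one can safely write prev_f $= f^*_{i,j-1}$. In fact, since $F_0 = 0$ and $F_{s-1} = 1$, then:

$$w_{\mathrm{prev}(i,1)} = w_{i-1,s-1} = w - i = w_{i,0}, \quad \text{for every } i \in (-W, W],$$

while $\mathrm{prev}(-W, 1) = (-W, 0)$ holds by definition.

This implies that $w_{\mathrm{prev}(i,j)} = w_{i,j-1}$ for every $i \in [-W, W]$ and $j \in [1, s-1]$.

Whence, prev_f $= f^*_{i,j-1}$.

So, at the $(i,j)$-th iteration of line 17 the following holds for every $v \in V$:

$$\begin{aligned}
\text{prev\_f}(v) \neq \top \text{ and } f(v) = \top \quad &\text{iff} \quad f^*_{i,j-1}(v) \neq \top \text{ and } f^*_{i,j}(v) = \top && [\text{by Lemma 8}] \\
&\text{iff} \quad v \in \mathcal{W}_0(\Gamma_{i,j-1}) \cap \mathcal{W}_1(\Gamma_{i,j}) && [\text{by Items 1-2 of Lemma [?]}] \\
&\text{iff} \quad \mathrm{val}^{\Gamma}(v) = i + F_{j-1} && [\text{by Theorem [?]}]
\end{aligned}$$

This implies that, at the $(i,j)$-th iteration of line 18, Algorithm 1 correctly assigns the value $\nu(v) = i + F[j-1] = i + F_{j-1}$ to the vertex $v$.

Since for every vertex $v \in V$ we have $\mathrm{val}^{\Gamma}(v) \in S_{\Gamma}$ (recall that $S_{\Gamma}$ admits the following representation $S_{\Gamma} = \{ i + F_j \mid i \in [-W, W), j \in [0, s-1] \}$), then, as soon as Algorithm 1 halts, $\nu(v) = \mathrm{val}^{\Gamma}(v)$ correctly holds for every $v \in V$. In turn, at line 20 and at line 22, the winning sets $\mathcal{W}_0$ and $\mathcal{W}_1$ are correctly assigned as well.

Now, let us assume that $\nu(v) = i + F_{j-1}$ holds at the $(i,j)$-th iteration of line 18, for some $v \in V$. Then, the following holds on prev_w at line 25:

$$\text{prev\_w} = w_{\mathrm{prev}(i,j)} = w_{i,j-1} = w - i - F_{j-1} = w - \nu(v) = w - \mathrm{val}^{\Gamma}(v).$$

Thus, at the $(i,j)$-th iteration of line 25, for every $v \in V_0$ and $u \in \mathrm{post}(v)$:

$$\begin{aligned}
\text{prev\_f}(v) \succeq \text{prev\_f}(u) \ominus \text{prev\_w}(v,u) \quad &\text{iff} \quad f^*_{\mathrm{prev}(i,j)}(v) \succeq f^*_{\mathrm{prev}(i,j)}(u) \ominus \big(w(v,u) - \mathrm{val}^{\Gamma}(v)\big) \\
&\text{iff} \quad (v,u) \text{ is compatible with } f^*_{\mathrm{prev}(i,j)} \text{ in } \Gamma^{w - \mathrm{val}^{\Gamma}(v)}.
\end{aligned}$$

Recall that $f^*_{\mathrm{prev}(i,j)}$ is the least SEPM of $\Gamma^{w - \mathrm{val}^{\Gamma}(v)}$; thus, by Theorem 4, the following implication holds: if $(v, u)$ is compatible with $f^*_{\mathrm{prev}(i,j)}$ in $\Gamma^{w - \mathrm{val}^{\Gamma}(v)}$, then $\sigma_0^*(v) = u$ is an optimal positional strategy for Player 0, at $v$, in the MPG $\Gamma$.

This implies that line 26 of Algorithm 1 is correct and concludes the proof. □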


4.3 Complexity Analysis 

The present section aims to show that Algorithm 1 always halts in $O(|V|^2 |E| W)$ time. This upper bound is established in the next proposition.

Proposition 2. Algorithm 1 always halts within $O(|V|^2 |E| W)$ time and it works with $O(|V|)$ space, on any input MPG $\Gamma = (V, E, w, \langle V_0, V_1 \rangle)$. Here, $W = \max_{e \in E} |w_e|$.

Proof. (Time Complexity of the Init Phase) The initialization of $\mathcal{W}_0, \mathcal{W}_1, \nu, \sigma_0^*$ (at line 1) and that of $f$ (at line 2) takes time $O(|V|)$. The initialization of $W$ at line 3 takes $O(|E|)$ time. To conclude, the size $s = |\mathcal{F}_{|V|}|$ of the Farey sequence (i.e., its total number of terms) can be computed in $O(n^{2/3} \log^{1/3} n)$ time, where $n = |V|$, as shown by Pawlewicz and Pătraşcu in [10]. Whence, the Init phase of Algorithm 1 takes $O(|E|)$ time overall.

(Time Complexity of the Scan Phases) To begin, notice that there are $O(|V|^2 W)$ scan phases overall. In fact, at line 5 the index $i$ goes from $-W$ to $W$, while at line 7 the index $j$ goes from 1 to $s - 1$, where $s = |\mathcal{F}_{|V|}| = O(|V|^2)$. Observe that, at each iteration, it takes $O(|E|)$ time to go from line 8 to line 14 and then from line 16 to line 26. In particular, at line 11, the $j$-th term $F_j$ of the Farey sequence $\mathcal{F}_{|V|}$ can be computed in $O(n^{2/3} \log^{1/3} n)$ time as shown by Pawlewicz and Pătraşcu in [10].
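The count of scan phases can be checked on small instances. The sketch below relies on the standard identity $|\mathcal{F}_n| = 1 + \sum_{k=1}^{n} \varphi(k)$, where $\varphi$ is Euler's totient function, and evaluates it naively; this is meant for illustration only and is unrelated to the sublinear algorithm of [10]. The names `farey_size` and `num_scan_phases` are hypothetical.

```python
from math import gcd


def farey_size(n):
    """|F_n| = 1 + sum of Euler's totient phi(k) for k = 1..n (naive computation)."""
    phi_sum = sum(1 for k in range(1, n + 1) for a in range(1, k + 1) if gcd(a, k) == 1)
    return 1 + phi_sum


def num_scan_phases(n_vertices, W):
    """Number of index pairs (i, j) with i in [-W, W] and j in [1, s-1]."""
    s = farey_size(n_vertices)
    return (2 * W + 1) * (s - 1)


sizes = [farey_size(n) for n in range(1, 6)]
phases = num_scan_phases(3, 2)
```

Since the totient sum grows quadratically in $n$, the product $(2W + 1)(s - 1)$ matches the $O(|V|^2 W)$ count stated above.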


Now, let us denote by $T_{i,j}$ the time taken by the $(i,j)$-th iteration of line 15, that is the time it takes to execute the Value-Iteration algorithm on input $\Gamma^{w'_{i,j}}$ with initial energy levels $\lceil D_j \, \text{prev\_f} \rceil$.

Then, the $(i,j)$-th scan phase always completes within the following time bound:

$$O(|E|) + T_{i,j}.$$

We now focus on $T_{i,j}$ and argue that the (aggregate) total cost $\sum_{i,j} T_{i,j}$ of executing the Value-Iteration algorithm for EGs at line 15 (throughout all scan phases) is only $O(|V|^2 |E| W)$. Stated otherwise, we aim to show that the amortized cost of executing the $(i,j)$-th scan phase is only $O(|E|)$.

Recall that the Value-Iteration algorithm for EGs consists, as a first step, of an initialization (which takes $O(|E|)$ time) and, then, of the repeated iteration of the following two operations: (1) the application of the lifting operator $\delta(f, v)$ (which takes $O(|\mathrm{post}(v)|)$ time) in order to resolve the inconsistency of $f$ in $v$, where $f(v)$ represents the current energy level and $v \in V$ is any vertex at which $f$ is inconsistent; and (2) the update of the list $L$ (which takes $O(|\mathrm{pre}(v)|)$ time), in order to keep track of all the vertices that witness an inconsistency. Recall that $L$ contains no duplicates.
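The two operations recalled above can be sketched in Python as follows. This is a simplified illustration rather than the implementation analysed in the paper: it recomputes the lifted value from scratch and omits any bookkeeping an efficient version would use, it assumes integer weights, it assumes $a \ominus b$ is $\max(0, a - b)$ capped at a given bound (beyond which the level becomes $\top$), and all names (`TOP`, `ominus`, `leq`, `value_iteration`) are hypothetical. The optional `init` argument plays the role of the initial energy levels passed at line 15 of Algorithm 1.

```python
TOP = None  # hypothetical sentinel for the top element


def ominus(a, b, bound):
    """Assumed convention: max(0, a - b), or TOP if a is TOP or the result exceeds bound."""
    if a is TOP:
        return TOP
    r = max(0, a - b)
    return r if r <= bound else TOP


def leq(a, b):
    """a <= b in the order where TOP is the greatest element."""
    return b is TOP or (a is not TOP and a <= b)


def value_iteration(v0, v1, post, weight, bound, init=None):
    """Least SEPM above `init` (all zeros by default), obtained by repeated lifting."""
    vertices = list(v0) + list(v1)
    f = {v: 0 for v in vertices} if init is None else dict(init)
    pre = {v: [] for v in vertices}
    for v in vertices:
        for u in post[v]:
            pre[u].append(v)

    def lifted(v):
        # Player 0 needs one compatible arc (min); Player 1 needs all of them (max).
        options = [ominus(f[u], weight[(v, u)], bound) for u in post[v]]
        best = options[0]
        for x in options[1:]:
            if v in v0:
                best = x if leq(x, best) else best
            else:
                best = best if leq(x, best) else x
        return best

    worklist = list(vertices)   # the list L of possibly inconsistent vertices
    queued = set(vertices)      # keeps L free of duplicates
    while worklist:
        v = worklist.pop()
        queued.discard(v)
        new = lifted(v)
        if not leq(new, f[v]):  # f is inconsistent at v: lift it
            f[v] = new
            for p in pre[v]:    # predecessors may have become inconsistent
                if p not in queued:
                    worklist.append(p)
                    queued.add(p)
    return f


post = {"a": ["b"], "b": ["a"]}
balanced = value_iteration({"a"}, {"b"}, post, {("a", "b"): -2, ("b", "a"): 2}, bound=2)
losing = value_iteration({"a"}, {"b"}, post, {("a", "b"): -2, ("b", "a"): 1}, bound=2)
warm = value_iteration({"a"}, {"b"}, post, {("a", "b"): -2, ("b", "a"): 2}, bound=2,
                       init={"a": 1, "b": 0})
```

Starting from levels that already lie below the least SEPM, as in the `warm` call, leads to the same result as starting from zero while performing fewer lifts, which is the effect that the amortized analysis below exploits.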


At this point, since at the $(i,j)$-th iteration of line 15 the Value-Iteration is executed on input $\Gamma^{w'_{i,j}}$, then a scaling factor on the maximum absolute weight $W$ must be taken into account. Indeed, it holds that:

$$W' := \max\big\{ |w'_{i,j}(e)| \;\big|\; e \in E, \; i \in [-W, W], \; j \in [0, s-1] \big\} = O(|V| W).$$

Remark. Actually, since $w'_{i,j} := D_j (w - i) - N_j$ (where $N_j / D_j = F_j \in \mathcal{F}_{|V|}$), then the scaling factor $D_j$ changes from iteration to iteration. Still, $D_j \leq |V|$ holds for every $j$.

At each application of the lifting operator $\delta(f, v)$ the energy level $f(v)$ increases by at least one unit with respect to the scaled-up maximum absolute weight $W'$. Stated otherwise, at each application of $\delta(f, v)$, the energy level $f(v)$ increases by at least $1/|V|$ units with respect to the original weight $W$.

Throughout the whole computation, the rational scalings of the energy levels never decrease by Lemma 8: in fact, at the $(i,j)$-th scan phase, Algorithm 1 executes the Value-Iteration with initial energy levels $\lceil D_j \, f^*_{\mathrm{prev}(i,j)} \rceil$. Whence, at line 15, the $(i,j)$-th execution of the Value-Iteration starts from the (carefully scaled-up) energy levels of the $\mathrm{prev}(i,j)$-th execution; roughly speaking, no energy ever gets lost during this process. Then, by Lemma [?], each energy level $f(v)$ can be lifted up at most $|V| W' = O(|V|^2 W)$ times.

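The above observations imply that the (aggregate) total cost of executing the Value-Iteration at line 15 (throughout all scan phases) can be bounded as follows:

$$\begin{aligned}
\sum_{i,j} T_{i,j} &= \sum_{\substack{-W \leq i \leq W \\ 1 \leq j \leq s-1}} \underbrace{O(|E|)}_{\text{init cost}} \; + \; \sum_{v \in V} \underbrace{O\big(|\mathrm{post}(v)| + |\mathrm{pre}(v)|\big)}_{\text{lifting } \delta \text{ and update of } L} \cdot \underbrace{O(|V| W')}_{\text{by Lemma [?]}} \\
&= O(|V|^2 |E| W) + O(|V|^2 W) \sum_{v \in V} O\big(|\mathrm{post}(v)| + |\mathrm{pre}(v)|\big) \\
&= O(|V|^2 |E| W).
\end{aligned}$$

Whence, Algorithm 1 always halts within the following time bound:

$$\mathrm{TIME}\big[\text{solve\_MPG}(\Gamma)\big] = \sum_{\substack{-W \leq i \leq W \\ 1 \leq j \leq s-1}} \big( O(|E|) + T_{i,j} \big) = O(|V|^2 |E| W).$$

This concludes the proof of the time complexity bound.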

We now turn our attention to the space complexity. 

(Space Complexity) First of all, although the Farey sequence has $|\mathcal{F}_{|V|}| = O(|V|^2)$ many elements, Algorithm 1 still works fine assuming that every next element of the sequence is generated on the fly at line 11. This computation can be performed in $O(|V|^{2/3} \log^{1/3} |V|)$ sub-linear time and space as shown by Pawlewicz and Pătraşcu [10]. Secondly, given $i$ and $j$, it is not necessary to actually store all weights $w'_{i,j}(e) := D_j (w(e) - i) - N_j$ for every $e \in E$, as one can compute them on the fly provided that $N_j$, $D_j$, $w$ and $e$ are given. Thirdly, Algorithm 1 needs to store in memory the two SEPMs $f$ and prev_f, but this requires only $O(|V|)$ space. Finally, at line 15, the Value-Iteration algorithm employs only $O(|V|)$ space. In fact the list $L$, which it maintains in order to keep track of inconsistencies, does not contain duplicate vertices and, therefore, its length is at most $|L| \leq |V|$. These facts imply altogether that Algorithm 1 works with $O(|V|)$ space. □


5 Conclusions 

In this work we proved an $O(|V|^2 |E| W)$ pseudo-polynomial time upper bound for the Value Problem and Optimal Strategy Synthesis in Mean Payoff Games. The result was achieved by providing a suitable description of values and positional strategies in terms of reweighted Energy Games and Small Energy-Progress Measures.

Along this line, we ask whether further improvements are within reach.